Environmental sound recognition using short-time feature aggregation

Cited by: 0

Authors:

Gerard Roma

Perfecto Herrera

Waldo Nogueira

Affiliations:

[1] Georgia Institute of Technology,School of Literature, Media and Communication

[2] Universitat Pompeu Fabra,Music Technology Group

[3] Medical University Hannover and Cluster of Excellence Hearing4all,Department of Otolaryngology

Source:

Journal of Intelligent Information Systems | 2018 / Vol. 51

Keywords:

Audio databases; Event detection; Environmental sound recognition; Audio features; Recurrence quantification analysis; Pattern recognition

DOI:

Not available

Abstract:

Recognition of environmental sound is usually based on two main architectures, depending on whether the model is trained with frame-level features or with aggregated descriptions of acoustic scenes or events. The former architecture is appropriate for applications where target categories are known in advance, while the latter affords a less supervised approach. In this paper, we propose a framework for environmental sound recognition based on blind segmentation and feature aggregation. We describe a new set of descriptors, based on Recurrence Quantification Analysis (RQA), which can be extracted from the similarity matrix of a time series of audio descriptors. We analyze their usefulness for recognition of acoustic scenes and events in addition to standard feature aggregation. Our results show the potential of non-linear time series analysis techniques for dealing with environmental sounds.
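
As a rough illustration of the RQA idea mentioned in the abstract, the sketch below thresholds a pairwise distance matrix of frame-level features into a binary recurrence plot and derives two standard RQA measures from it (recurrence rate and determinism). This is a minimal sketch under stated assumptions, not the descriptor set used in the paper: the Euclidean distance, the fixed `radius` threshold, the `min_length` parameter, and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def recurrence_matrix(features, radius):
    """Binary recurrence plot: 1 where two frames are within `radius`.

    `features` has shape (n_frames, n_dims). Euclidean distance and a
    fixed radius are assumptions made for this sketch.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist <= radius).astype(int)


def diagonal_line_lengths(rec):
    """Lengths of all runs of 1s on the diagonals above the main diagonal."""
    lengths = []
    n = rec.shape[0]
    for k in range(1, n):
        run = 0
        for value in np.diagonal(rec, offset=k):
            if value:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:  # close a run that reaches the end of the diagonal
            lengths.append(run)
    return lengths


def rqa_summary(features, radius, min_length=2):
    """Recurrence rate and determinism over the upper triangle."""
    rec = recurrence_matrix(features, radius)
    n = rec.shape[0]
    n_pairs = n * (n - 1) // 2                # frame pairs, main diagonal excluded
    recurrent = int(np.triu(rec, k=1).sum())  # recurrent pairs among them
    lengths = diagonal_line_lengths(rec)
    in_lines = sum(length for length in lengths if length >= min_length)
    return {
        # fraction of frame pairs that are recurrent
        "recurrence_rate": recurrent / n_pairs,
        # fraction of recurrent points lying on diagonal lines
        "determinism": in_lines / recurrent if recurrent else 0.0,
    }


if __name__ == "__main__":
    # Toy one-dimensional "feature" sequence that alternates between two states.
    toy = np.array([[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]])
    print(rqa_summary(toy, radius=0.1))
```

For the strictly periodic toy sequence, every recurrent point falls on a diagonal line, so the determinism is 1.0; a noisier sequence would yield a lower value. In practice the input would be a matrix of audio descriptors computed per frame rather than toy data.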

Pages: 457 - 475

Number of pages: 18

50 items in total

[1] Environmental sound recognition using short-time feature aggregation
Roma, Gerard
Herrera, Perfecto
Nogueira, Waldo
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2018, 51 (03) : 457 - 475
[2] Short-time acoustic scene recognition method using multi-scale feature fusion
Wang, Meng
Zhang, Pengyuan
Shengxue Xuebao/Acta Acustica, 2022, 47 (06) : 717 - 726
[3] On Feature Selection in Environmental Sound Recognition
Mitrovic, Dalibor
Zeppelzauer, Matthias
Eidenberger, Horst
PROCEEDINGS ELMAR-2009, 2009 : 201 - 204
[4] Short-time signal analysis using pattern recognition methods
Bogus, P
Lewandowska, KD
ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING - ICAISC 2004, 2004, 3070 : 550 - 555
[5] SHORT-TIME SPECTRAL AGGREGATION FOR SPEAKER EMBEDDING
Tu, Youzhi
Mak, Man-Wai
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 6708 - 6712
[6] SOUND AND VIBRATION SIGNAL ANALYSIS USING IMPROVED SHORT-TIME FOURIER REPRESENTATION
Lee, June-Yule
INTERNATIONAL JOURNAL OF AUTOMOTIVE AND MECHANICAL ENGINEERING, 2013, 7 : 811 - 819
[7] Feature Extraction Using HHT-based Locally Optimized Short-Time Fractional Fourier Transform for Speaker Recognition
Wang, Jinfang
Du, Hailong
Guo, Ming
Nie, Xinli
Luan, Shuxin
Liu, Chang
2017 IEEE INTERNATIONAL CONFERENCE ON IMAGING, VISION & PATTERN RECOGNITION (ICIVPR), 2017
[8] Zipf's Law in Short-Time Timbral Codings of Speech, Music, and Environmental Sound Signals
Haro, Martin
Serra, Joan
Herrera, Perfecto
Corral, Alvaro
PLOS ONE, 2012, 7 (03)
[9] Short-time Activity Recognition With Wearable Sensors Using Convolutional Neural Network
Sheng, Min
Jiang, Jing
Su, Benyue
Tang, Qingfeng
Yahya, Ali Abdullah
Wang, Guangjun
PROCEEDINGS VRCAI 2016: 15TH ACM SIGGRAPH CONFERENCE ON VIRTUAL-REALITY CONTINUUM AND ITS APPLICATIONS IN INDUSTRY, 2016 : 413 - 416
[10] Frequency Recognition of Short-Time SSVEP Signal Using CORRCA-Based Spatio-Spectral Feature Fusion Framework
Mahmood, Shabbir
Shin, Jungpil
Farhana, Iffat
Islam, Md. Rabiul
Molla, Md. Khademul Islam
IEEE ACCESS, 2021, 9 (09) : 167744 - 167755
