Environmental sound recognition using short-time feature aggregation

Authors
Gerard Roma
Perfecto Herrera
Waldo Nogueira
Affiliations
[1] School of Literature, Media and Communication, Georgia Institute of Technology
[2] Music Technology Group, Universitat Pompeu Fabra
[3] Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all
Keywords
Audio databases; Event detection; Environmental sound recognition; Audio features; Recurrence quantification analysis; Pattern recognition
DOI
Not available
Abstract
Recognition of environmental sound is usually based on two main architectures, depending on whether the model is trained with frame-level features or with aggregated descriptions of acoustic scenes or events. The former architecture is appropriate for applications where target categories are known in advance, while the latter affords a less supervised approach. In this paper, we propose a framework for environmental sound recognition based on blind segmentation and feature aggregation. We describe a new set of descriptors, based on Recurrence Quantification Analysis (RQA), which can be extracted from the similarity matrix of a time series of audio descriptors. We analyze their usefulness for recognition of acoustic scenes and events in addition to standard feature aggregation. Our results show the potential of non-linear time series analysis techniques for dealing with environmental sounds.
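As a rough illustration of the kind of descriptor the abstract refers to, the sketch below computes two standard RQA measures (recurrence rate and determinism) from a thresholded similarity matrix of frame-level features. It is not the authors' implementation: the function names, the Euclidean distance, the `radius` threshold, the `min_length` parameter and the toy input are all assumptions made for this example, and the record does not say which RQA measures or settings the paper uses.

```python
import numpy as np

def recurrence_matrix(features, radius):
    """Binary recurrence plot: 1 where two frames lie within `radius`."""
    # Pairwise Euclidean distances between frame-level feature vectors
    # (the distance measure is an assumption for this sketch).
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    return (dist <= radius).astype(int)

def recurrence_rate(rec):
    """Fraction of recurrent points, main diagonal included."""
    return rec.sum() / rec.size

def diagonal_line_lengths(rec):
    """Lengths of runs of consecutive 1s on diagonals above the main one."""
    lengths = []
    for k in range(1, rec.shape[0]):
        run = 0
        for value in np.diagonal(rec, offset=k):
            if value:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def determinism(rec, min_length=2):
    """Share of off-diagonal recurrent points lying on lines >= min_length."""
    lengths = diagonal_line_lengths(rec)
    total = sum(lengths)
    if total == 0:
        return 0.0
    return sum(n for n in lengths if n >= min_length) / total

# Toy input (hypothetical): six one-dimensional frames alternating 0 and 1.
frames = np.array([[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]])
rec = recurrence_matrix(frames, radius=0.1)
print(recurrence_rate(rec), determinism(rec))
```

In a feature-aggregation setting, values like these would be appended to the usual per-segment statistics of the frame-level features; how the paper combines them is not stated in this record.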
Pages: 457-475
Number of pages: 18
Source
Journal of Intelligent Information Systems, 2018, 51 (03): 457-475