Sound Event Localization and Detection Based on Deep Learning

Cited by: 0
Authors
Zhao, Dada [1 ,2 ]
Ding, Kai [2 ]
Qi, Xiaogang [1 ]
Chen, Yu [2 ]
Feng, Hailin [1 ]
Affiliations
[1] Xidian Univ, Sch Math & Stat, Xian 710071, Peoples R China
[2] Sci & Technol Near Surface Detect Lab, Wuxi 214035, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Location awareness; Feature extraction; Neural networks; Convolutional neural networks; Reverberation; Prediction algorithms; Training; sound event localization and detection (SELD); deep learning; convolutional recurrent neural network (CRNN); channel attention mechanism; DATA AUGMENTATION; NEURAL-NETWORKS; SPECTRUM
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Acoustic source localization (ASL) and sound event detection (SED) have largely been pursued as independent research fields. In recent years, sound event localization and detection (SELD) has become an active research topic because it offers a more complete spatial and temporal representation of the sound field. This paper presents a deep learning-based algorithm for localizing and detecting multiple overlapping sound events in three-dimensional space. The log-Mel spectrum and the generalized cross-correlation spectrum are concatenated along the channel dimension to form the input features. A trained neural network then classifies and regresses these features in parallel to produce the sound recognition and localization results, respectively. A channel attention mechanism is also introduced into the network to selectively enhance features that carry essential information and suppress uninformative ones. Finally, a thorough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and to changes in environment, and that it achieves higher recognition and localization accuracy than the baseline method.
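The feature pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a PHAT-weighted generalized cross-correlation, a 4-microphone array, random placeholders in place of real log-Mel features, and a squeeze-and-excitation style channel attention with untrained random weights; the abstract specifies none of these details. All function names (`gcc_phat`, `stack_features`, `channel_attention`) are hypothetical.

```python
import numpy as np


def gcc_phat(x, y, n_lags):
    """PHAT-weighted cross-correlation of two equal-length frames.

    Returns n_lags values centred on lag 0 (index n_lags // 2).
    The PHAT weighting is an assumption; the abstract only says
    "generalized cross-correlation spectrum".
    """
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12           # keep phase only (PHAT)
    cc = np.fft.irfft(cross, n=n)
    half = n_lags // 2
    # Negative lags live at the end of the inverse FFT output.
    return np.concatenate((cc[-half:], cc[:n_lags - half]))


def stack_features(logmel, gcc):
    """Concatenate (channels, frames, bins) arrays along the channel axis."""
    return np.concatenate((logmel, gcc), axis=0)


def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention (assumed variant).

    features: (C, T, F); w1: (C // r, C); w2: (C, C // r).
    Returns the reweighted features and the per-channel weights.
    """
    squeezed = features.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)           # bottleneck + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate in [0, 1]
    return features * weights[:, None, None], weights


# Demo with placeholder data (assumed sizes, untrained random weights).
rng = np.random.default_rng(0)
n_mics, n_frames, n_bins = 4, 20, 64
n_pairs = n_mics * (n_mics - 1) // 2                  # 6 microphone pairs

logmel = rng.standard_normal((n_mics, n_frames, n_bins))   # stand-in for log-Mel
gcc = rng.standard_normal((n_pairs, n_frames, n_bins))     # stand-in for GCC maps
features = stack_features(logmel, gcc)                     # (10, 20, 64)

n_channels = features.shape[0]
w1 = rng.standard_normal((n_channels // 2, n_channels))
w2 = rng.standard_normal((n_channels, n_channels // 2))
attended, weights = channel_attention(features, w1, w2)
print(attended.shape)
```

In a trained network the gate weights would be learned jointly with the rest of the model, so that channels carrying useful localization or detection cues receive weights near 1 and uninformative channels are scaled toward 0. Matching the number of GCC lags to the number of Mel bins, as in the placeholder shapes here, is one common way to make the two feature types stackable, and is likewise an assumption.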
Pages: 294-301
Page count: 8