Sound Event Localization and Detection Based on Deep Learning

Cited by: 0

Authors:

Zhao, Dada [1,2]

Ding, Kai [2]

Qi, Xiaogang [1]

Chen, Yu [2]

Feng, Hailin [1]

Affiliations:

[1] Xidian Univ, Sch Math & Stat, Xian 710071, Peoples R China

[2] Sci & Technol Near Surface Detect Lab, Wuxi 214035, Peoples R China

Source:

JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS | 2024 / Vol. 35 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Location awareness; Feature extraction; Neural networks; Convolutional neural networks; Reverberation; Prediction algorithms; Training; sound event localization and detection (SELD); deep learning; convolutional recursive neural network (CRNN); channel attention mechanism; DATA AUGMENTATION; NEURAL-NETWORKS; SPECTRUM;

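DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract: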

Acoustic source localization (ASL) and sound event detection (SED) are two widely pursued, independent research fields. In recent years, sound event localization and detection (SELD) has become a very active research topic, motivated by the goal of a more complete spatial and temporal representation of the sound field. This paper presents a deep learning-based algorithm for localizing and detecting multiple overlapping sound events in three-dimensional space. The Log-Mel spectrum and the generalized cross-correlation spectrum are concatenated along the channel dimension as input features. A neural network trained on these features performs classification and regression in parallel to obtain sound recognition and localization results, respectively. A channel attention mechanism is also introduced in the network to selectively enhance features containing essential information and suppress useless ones. Finally, a thorough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and to the environment, and achieves higher recognition and localization accuracy than the baseline method.
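
The feature stacking and channel attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the attention variant, so a squeeze-and-excitation style gate is assumed here, and the function names, array shapes (4 microphones, 6 microphone pairs, 64 bins), and random weights are all hypothetical.

```python
import numpy as np

def stack_features(log_mel, gcc):
    """Concatenate Log-Mel and GCC features along the channel axis.

    Both inputs are assumed to have shape (channels, time, bins)
    with matching time and bin dimensions.
    """
    if log_mel.shape[1:] != gcc.shape[1:]:
        raise ValueError("time/bin dimensions must match")
    return np.concatenate([log_mel, gcc], axis=0)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel gate (an assumed variant).

    x:  (C, T, F) feature tensor
    w1: (C // r, C) reduction weights
    w2: (C, C // r) expansion weights
    Returns the reweighted tensor and the per-channel gate.
    """
    squeeze = x.mean(axis=(1, 2))                 # global average per channel, (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid, values in (0, 1)
    return x * gate[:, None, None], gate

# Hypothetical example: 4 microphones give 4 Log-Mel channels
# and 6 microphone pairs give 6 GCC channels.
rng = np.random.default_rng(0)
log_mel = rng.standard_normal((4, 10, 64))
gcc = rng.standard_normal((6, 10, 64))

features = stack_features(log_mel, gcc)
num_channels = features.shape[0]

# Random weights stand in for parameters that would be learned in training.
w1 = rng.standard_normal((num_channels // 2, num_channels)) * 0.1
w2 = rng.standard_normal((num_channels, num_channels // 2)) * 0.1
weighted, gate = channel_attention(features, w1, w2)
```

In a trained network the gate would scale up channels that carry useful information and scale down the rest; with random weights, as here, it only demonstrates the data flow.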


Pages: 294-301

Number of pages: 8
