Environment Sound Event Classification With a Two-Stream Convolutional Neural Network

Cited by: 26

Authors:

Dong, Xifeng [1]

Yin, Bo [1,2]

Cong, Yanping [1]

Du, Zehua [1]

Huang, Xianqing [1]

Affiliations:

[1] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China

[2] Pilot Natl Lab Marine Sci & Technol, Qingdao 266237, Peoples R China

Source:

IEEE Access | 2020 / Vol. 8

Funding:

National Natural Science Foundation of China

Keywords:

Environmental sound classification; sound recognition; convolutional neural networks; data processing; pre-emphasis; two stream model; RECOGNITION; REPRESENTATIONS

DOI:

10.1109/ACCESS.2020.3007906

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In recent years, with the construction of intelligent cities, research on environmental sound classification (ESC) has become increasingly important. However, because environmental sound is non-stationary and ambient noise interferes strongly, the recognition accuracy of ESC is still not high enough. Even with deep learning methods, it is difficult for a model with a single input to fully extract features. To improve the recognition accuracy of ESC, this paper proposes a two-stream convolutional neural network (CNN) based on a raw audio CNN (RACNN) and a logmel CNN (LMCNN). In this method, a pre-emphasis module is first constructed to process the raw audio signal. The processed audio data and the logmel data are fed into RACNN and LMCNN, respectively, to obtain both time-domain and frequency-domain features of the audio. In addition, a random-padding method is proposed to pad shorter data sequences, which greatly increases the amount of data available for the experiments. Finally, the effectiveness of the methods is verified experimentally on the UrbanSound8K dataset.
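The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no parameters, so the coefficient `alpha=0.97` is only a commonly used default, and `random_pad` reflects one assumed reading of "random padding" (placing a short clip at a random offset inside a zero-filled buffer). Both function names are hypothetical.

```python
import random


def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis: y[0] = x[0], y[n] = x[n] - alpha * x[n-1].

    alpha is an assumed, commonly used value; the record does not state it.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def random_pad(clip, target_len, rng=None):
    """Pad a short clip to target_len samples at a random offset.

    Assumed interpretation of "random padding": zeros are added before and
    after the clip, with the split between the two chosen at random.
    Clips that are already long enough are truncated to target_len.
    """
    rng = rng or random.Random()
    if len(clip) >= target_len:
        return list(clip[:target_len])
    offset = rng.randint(0, target_len - len(clip))
    tail = target_len - len(clip) - offset
    return [0.0] * offset + list(clip) + [0.0] * tail
```

Under these assumptions, calling `random_pad` several times on the same short clip yields differently aligned fixed-length inputs, which is one way a padding scheme could increase the amount of usable training data.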

Pages: 125714-125721

Page count: 8
