Environment Sound Event Classification With a Two-Stream Convolutional Neural Network

Cited by: 26
Authors
Dong, Xifeng [1]
Yin, Bo [1, 2]
Cong, Yanping [1]
Du, Zehua [1]
Huang, Xianqing [1]
Affiliations
[1] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China
[2] Pilot Natl Lab Marine Sci & Technol, Qingdao 266237, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Environmental sound classification; sound recognition; convolutional neural networks; data processing; pre-emphasis; two stream model; RECOGNITION; REPRESENTATIONS;
DOI
10.1109/ACCESS.2020.3007906
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In recent years, with the construction of smart cities, research on environmental sound classification (ESC) has become increasingly important. However, because environmental sound is non-stationary and ambient noise interferes strongly, the recognition accuracy of ESC remains insufficient. Even with deep learning methods, a model with a single input struggles to extract features fully. To improve the recognition accuracy of ESC, this paper proposes a two-stream convolutional neural network (CNN) based on a raw audio CNN (RACNN) and a logmel CNN (LMCNN). In this method, a pre-emphasis module is first constructed to process the raw audio signal. The processed audio data and the logmel data are fed into RACNN and LMCNN, respectively, to obtain both time and frequency features of the audio. In addition, a random-padding method is proposed to patch shorter data sequences, which greatly increases the data available for the experiments. Finally, the effectiveness of the methods is verified experimentally on the UrbanSound8K dataset.
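As a rough illustration of the two preprocessing steps named in the abstract, the sketch below applies a standard first-order pre-emphasis filter and pads a short clip to a fixed length at a random offset. The abstract does not give the filter coefficient, the padding values, or the target length, so `alpha=0.97`, zero filling, and the function names `pre_emphasis` and `random_pad` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pre_emphasis(signal, alpha=0.97):
    """Standard first-order pre-emphasis: y[0] = x[0], y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a common textbook default, assumed here; the abstract
    does not state the value used in the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size == 0:
        return signal.copy()
    return np.concatenate(([signal[0]], signal[1:] - alpha * signal[:-1]))


def random_pad(signal, target_len, rng=None):
    """Pad a short clip to target_len by placing it at a random offset.

    Assumption: the gap is filled with zeros. The abstract only says that
    shorter sequences are patched by "random padding", not how.
    Clips that are already long enough are truncated to target_len.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size >= target_len:
        return signal[:target_len].copy()
    rng = np.random.default_rng() if rng is None else rng
    # Any offset from 0 to (target_len - clip length) keeps the clip inside the buffer.
    offset = int(rng.integers(0, target_len - signal.size + 1))
    padded = np.zeros(target_len, dtype=np.float64)
    padded[offset:offset + signal.size] = signal
    return padded


if __name__ == "__main__":
    clip = np.sin(np.linspace(0.0, 20.0, 500))       # stand-in for a short audio clip
    emphasized = pre_emphasis(clip)
    fixed = random_pad(emphasized, target_len=1000)  # hypothetical fixed input length
    print(emphasized.shape, fixed.shape)
```

Because the offset is drawn at random, calling `random_pad` repeatedly on the same short clip yields differently aligned training examples, which is one plausible reading of how the method increases the available data.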
Pages: 125714-125721
Number of pages: 8
Related papers
50 in total
  • [41] An Improved Two-stream 3D Convolutional Neural Network for Human Action Recognition
    Chen, Jun
    Xu, Yuanping
    Zhang, Chaolong
    Xu, Zhijie
    Meng, Xiangxiang
    Wang, Jie
    2019 25TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND COMPUTING (ICAC), 2019, : 135 - 140
  • [42] Two-stream graph convolutional neural network fusion for weakly supervised temporal action detection
    Mengyao Zhao
    Zhengping Hu
    Shufang Li
    Shuai Bi
    Zhe Sun
    Signal, Image and Video Processing, 2022, 16 : 947 - 954
  • [43] Improved human action recognition approach based on two-stream convolutional neural network model
    Congcong Liu
    Jie Ying
    Haima Yang
    Xing Hu
    Jin Liu
    The Visual Computer, 2021, 37 : 1327 - 1341
  • [44] Pornographic Video Detection with Convolutional Two-Stream Network Fusion
    Lee, Wonjae
    Kim, Junghak
    Lee, Nam Kyung
    11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1273 - 1275
  • [45] A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction
    Manli Zhu
    Qianhui Men
    Edmond S. L. Ho
    Howard Leung
    Hubert P. H. Shum
    Journal of Medical Systems, 46
  • [46] Construction and Application of Quality Assessment Model of No Reference Images Two-Stream Convolutional Neural Network
    Kang, Dong
    Informatica (Slovenia), 2024, 48 (15): : 163 - 178
  • [47] A Multi-task Two-stream Spatiotemporal Convolutional Neural Network for Convective Storm Nowcasting
    Zhang, Wei
    Liu, Hongling
    Li, Pengfei
    Han, Lei
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 3953 - 3960
  • [48] The Very Deep Multi-stage Two-stream Convolutional Neural Network for Action Recognition
    Gao, Xiuju
    Zhang, Hanling
    PROCEEDINGS OF THE 2016 3RD INTERNATIONAL CONFERENCE ON MECHATRONICS AND INFORMATION TECHNOLOGY (ICMIT), 2016, 49 : 265 - 269
  • [49] Improving human action recognition with two-stream 3D convolutional neural network
    Van-Minh Khong
    Thanh-Hai Tran
    2018 1ST INTERNATIONAL CONFERENCE ON MULTIMEDIA ANALYSIS AND PATTERN RECOGNITION (MAPR), 2018,
  • [50] TWO-STREAM HYBRID ATTENTION NETWORK FOR MULTIMODAL CLASSIFICATION
    Chen, Qipin
    Shi, Zhenyu
    Zuo, Zhen
    Fu, Jinmiao
    Sun, Yi
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 359 - 363