Multi-Scale and Single-Scale Fully Convolutional Networks for Sound Event Detection

Cited by: 0
Authors
Wang Y. [1]
Zhao G. [1]
Xiong K. [1]
Shi G. [1]
Zhang Y. [1]
Affiliation
[1] School of Artificial Intelligence, Xidian University, Xi'an, 710071, Shaanxi
Source
Neurocomputing | 2021 / Vol. 421
Keywords
Dilated convolution; Multi-Scale Fully Convolutional Networks; Single-Scale Fully Convolutional Networks; Sound Event Detection; Temporal dependencies
DOI
10.1016/j.neucom.2020.09.038
Abstract
Among various Sound Event Detection (SED) systems, Recurrent Neural Networks (RNNs), such as long short-term memory (LSTM) units and gated recurrent units (GRUs), are used to capture temporal dependencies, but the length of the dependencies they can model is limited, so they fail to model sound events of long duration. Moreover, RNNs cannot process data in parallel, which leads to low efficiency and low industrial value. Given these shortcomings, we propose to use dilated convolution (and causal dilated convolution) to capture temporal dependencies, because it preserves high time resolution and obtains longer temporal dependencies while the filter size and the network depth remain unchanged. In addition, dilated convolution can be parallelized, so it offers higher efficiency and industrial value. Based on this, we propose Single-Scale Fully Convolutional Networks (SS-FCN), composed of convolutional neural networks and dilated convolutional networks, with the former providing frequency invariance and the latter capturing temporal dependencies. Because dilated convolution lets us control the length of temporal dependencies, we observe that SS-FCN, which models a single length of temporal dependencies, achieves superior detection performance for a limited number of event types. For better performance, we propose Multi-Scale Fully Convolutional Networks (MS-FCN), in which a feature fusion module is introduced to capture both long-term and short-term dependencies by fusing features with different lengths of temporal dependencies. The proposed methods achieve competitive performance on three major datasets with higher efficiency. The results show that SED systems based on Fully Convolutional Networks have further research value and potential. © 2020 Elsevier B.V.
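The abstract's claim that dilated convolution lengthens temporal dependencies without changing the filter size or the network depth can be illustrated with a small sketch. The kernel size (3), the depth (4 layers), and the dilation schedule (1, 2, 4, 8) below are assumptions chosen for illustration, not the configuration reported in the paper, and the function names are hypothetical.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input frames that influence one output frame of a stack
    of stride-1 dilated convolutions (one layer per dilation value)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """Causal dilated convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Frames before the start of the signal are treated as zeros, so the
    output has the same length as the input (time resolution is kept)
    and y[t] never depends on future frames.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the first frame
                acc += w * x[idx]
        out.append(acc)
    return out


# Same kernel size and depth, with and without dilation (assumed values).
plain = receptive_field(3, [1, 1, 1, 1])    # 9 frames
dilated = receptive_field(3, [1, 2, 4, 8])  # 31 frames

# One causal layer with dilation 2 on a toy 5-frame signal.
y = causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
```

With these assumed values, four ordinary layers see 9 frames while four dilated layers see 31, and the output of the causal layer keeps one value per input frame. Each output frame is computed independently of the others, which is why this kind of layer can be evaluated in parallel across time.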
Pages: 51-65
Page count: 14