Convolutional Neural Networks for Scops Owl Sound Classification

Cited by: 25
Authors
Hidayat, Alam Ahmad [1 ]
Cenggoro, Tjeng Wawan [1 ,2 ]
Pardamean, Bens [1 ,3 ]
Affiliations
[1] Bina Nusantara Univ, Bioinformat & Data Sci Res Ctr, Jakarta 11480, Indonesia
[2] Bina Nusantara Univ, Sch Comp Sci, Comp Sci Dept, Jakarta 11480, Indonesia
[3] Bina Nusantara Univ, Comp Sci Dept, BINUS Grad Program Master Comp Sci, Jakarta 11480, Indonesia
Keywords
acoustic features; bird sound classification; convolutional neural network; mean average precision; scops owl;
DOI
10.1016/j.procs.2021.12.010
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Adopting deep learning models for bird sound classification has become common practice for building robust automated bird sound detection systems. In this paper, we employ a four-layer Convolutional Neural Network (CNN) designed to classify different species of Indonesian scops owls based on their vocalizations. Two widely used representations of an acoustic signal, the log-scaled mel-spectrogram and Mel-Frequency Cepstral Coefficients (MFCC), are extracted from each sound file and fed into the network separately to compare model performance across inputs. A more complex CNN that processes the two acoustic representations simultaneously is proposed to provide a direct comparison with the baseline model. The dual-input network is the best-performing model in our experiments, achieving 97.55% Mean Average Precision (MAP), while the baseline model achieves MAP scores of 94.36% with the mel-spectrogram input and 96.08% with the MFCC input. (C) 2021 The Authors. Published by Elsevier B.V.
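The abstract outlines a two-stage pipeline: per-clip extraction of a log-scaled mel-spectrogram and MFCCs, followed by a CNN classifier fed with either one representation (the four-layer baseline) or both at once (the dual-input model). The paper's exact architecture and hyperparameters are not given in this record, so the sketch below is only a hedged reconstruction of the idea; the library choices (librosa, tensorflow.keras), feature sizes, filter counts, and the number of owl classes are illustrative assumptions rather than the authors' settings.

# A minimal sketch of the pipeline described in the abstract, assuming a
# librosa + tensorflow.keras implementation. All hyperparameters (mel bands,
# MFCC coefficients, filter counts, frame length, number of owl classes)
# are assumptions, not values taken from the paper.
import numpy as np
import librosa
from tensorflow.keras import layers, Model

N_MELS, N_MFCC, N_FRAMES, N_CLASSES = 128, 20, 128, 5  # assumed values

def extract_features(path, sr=22050):
    """Return (log_mel, mfcc) matrices with a fixed number of time frames."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    log_mel = librosa.power_to_db(mel, ref=np.max)          # log-scaled mel-spectrogram
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)  # MFCC matrix

    def fix_length(x):  # pad or truncate the time axis so every clip matches
        if x.shape[1] < N_FRAMES:
            x = np.pad(x, ((0, 0), (0, N_FRAMES - x.shape[1])))
        return x[:, :N_FRAMES]

    return fix_length(log_mel), fix_length(mfcc)

def conv_branch(inp):
    """Small Conv/Pool stack; each acoustic representation gets its own branch."""
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    return layers.Flatten()(x)

# Dual-input network: one branch per representation, fused before the classifier.
mel_in = layers.Input(shape=(N_MELS, N_FRAMES, 1), name="log_mel")
mfcc_in = layers.Input(shape=(N_MFCC, N_FRAMES, 1), name="mfcc")
fused = layers.Concatenate()([conv_branch(mel_in), conv_branch(mfcc_in)])
hidden = layers.Dense(64, activation="relu")(fused)
output = layers.Dense(N_CLASSES, activation="softmax")(hidden)
model = Model(inputs=[mel_in, mfcc_in], outputs=output)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Example usage on a single (hypothetical) clip:
# log_mel, mfcc = extract_features("owl_clip.wav")
# probs = model.predict([log_mel[None, ..., None], mfcc[None, ..., None]])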
Pages: 81 - 87
Page count: 7
Related Papers
50 records in total
  • [1] Sound Classification Using Convolutional Neural Networks
    Jaiswal, Kaustumbh
    Patel, Dhairya Kalpeshbhai
    2018 SEVENTH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING IN EMERGING MARKETS (CCEM), 2018, : 81 - 84
  • [2] ENVIRONMENTAL SOUND CLASSIFICATION WITH CONVOLUTIONAL NEURAL NETWORKS
    Piczak, Karol J.
    2015 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2015
  • [3] Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
    Salamon, Justin
    Bello, Juan Pablo
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (03) : 279 - 283
  • [4] Technical Sound Event Classification Applying Recurrent and Convolutional Neural Networks
    Rieder, Constantin
    Germann, Markus
    Mezger, Samuel
    Scherer, Klaus
    PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON DEEP LEARNING THEORY AND APPLICATIONS (DELTA), 2020, : 84 - 88
  • [5] AGRICULTURAL HARVESTER SOUND CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORKS AND SPECTROGRAMS
    Khorasani, Nioosha E.
    Thomas, Gabriel
    Balocco, Simone
    Mann, Danny
    APPLIED ENGINEERING IN AGRICULTURE, 2022, 38 (02) : 455 - 459
  • [6] Lung Sound Classification Using Snapshot Ensemble of Convolutional Neural Networks
    Nguyen, Truc
    Pernkopf, Franz
    42ND ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 760 - 763
  • [7] Eating Sound Dataset for 20 Food Types and Sound Classification Using Convolutional Neural Networks
    Ma, Jeannette Shijie
    Maureira, Marcello A. Gomez
    van Rijn, Jan N.
    COMPANION PUBLICATION OF THE 2020 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (ICMI '20 COMPANION), 2020, : 348 - 351
  • [8] Environmental Sound Classification using Deep Convolutional Neural Networks and Data Augmentation
    Davis, Nithya
    Suresh, K.
    2018 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2018, : 41 - 45
  • [9] Performance comparison of lung sound classification using various convolutional neural networks
    Kim, Gee Yeun
    Kim, Hyoung-Gook
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2019, 38 (05): : 568 - 573
  • [10] Convolutional Recurrent Neural Networks for Urban Sound Classification using Raw Waveforms
    Sang, Jonghee
    Park, Soomyung
    Lee, Junwoo
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 2444 - 2448