Classification of Environmental Sounds with Convolutional Neural Networks

Cited by: 0
Authors
Dincer, Yalcin [1]
Inik, Ozkan [2]
Affiliations
[1] Bingol Univ, Tekn Bilimler Meslek Yuksekokulu, Bilgisayar Teknol Bolumu, Bingol, Turkiye
[2] Tokat Gaziosmanpasa Univ, Muhendislik & Mimarlik Fak, Bilgisayar Muhendisligi Bolumu, Tokat, Turkiye
Source
KONYA JOURNAL OF ENGINEERING SCIENCES | 2023 / Vol. 11 / No. 02
Keywords
Deep Learning; Convolutional Neural Network; Environmental Sound Classification; ESC10; UrbanSound8K; SURVEILLANCE; MATRIX; RECOGNITION;
DOI
10.36306/konjes.1201558
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Sound data is critical for predicting the effects of environmental activities and for gathering information about the settings in which they occur. It provides basic information about urban concerns such as noise pollution, security systems, health care, and local services, which makes Environmental Sound Classification (ESC) increasingly important. Because the volume of data is growing and analysis time is limited, new and powerful artificial intelligence methods are needed to identify sounds automatically and instantly. Such methods can be built on Convolutional Neural Network (CNN) models, which have achieved high accuracy in other fields. This study therefore proposes a new CNN-based method for classifying two ESC datasets, ESC10 and UrbanSound8K. The sounds are first converted into image format, and novel CNN models are then designed to classify these images. For each dataset, the model with the highest accuracy was selected from among the multiple CNN models designed. The sound recordings in each dataset were converted to images of 32x32x3 and 224x224x3 dimensions, yielding four image datasets. The CNN models developed to classify them are named ESC10_ESA32, ESC10_ESA224, URBANSOUND8K_ESA32, and URBANSOUND8K_ESA224, respectively. The models were trained on the datasets using 10-fold cross-validation. The average accuracies of the ESC10_ESA32, ESC10_ESA224, URBANSOUND8K_ESA32, and URBANSOUND8K_ESA224 models are 80.75%, 82.25%, 88.60%, and 84.33%, respectively. Compared with baseline studies in the literature on the same datasets, these models achieve better results.
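The abstract reports average accuracies obtained with 10-fold cross-validation. The following is a minimal sketch of how such an average can be computed, not the authors' implementation: the function names, the `train_and_score` callback, the random seed, and the sample count of 400 are all hypothetical placeholders, and the model training itself is left out.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Return the mean of the k held-out scores.

    train_and_score(train_idx, test_idx) is a hypothetical callback that
    trains a model on train_idx and returns its accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return mean(scores)


if __name__ == "__main__":
    # Placeholder scorer: reports the held-out fraction instead of a real accuracy.
    def dummy_score(train_idx, test_idx):
        return len(test_idx) / (len(train_idx) + len(test_idx))

    print(cross_validate(400, dummy_score))
```

In practice the callback would train one of the CNN models on the image-format training folds and return its accuracy on the held-out fold; the figure reported per model would then be the mean over the ten folds.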
Pages: 24