Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

Cited by: 1
Authors
Gourisaria M.K. [1]
Agrawal R. [1]
Sahni M. [2]
Singh P.K. [3]
Affiliations
[1] School of Computer Engineering, KIIT Deemed to Be University, Odisha, Bhubaneswar
[2] Department of Mathematics, Pandit Deendayal Energy University, Gujarat, Gandhinagar
[3] Central University of Jammu, Jammu & Kashmir, Bagla Suchani
Source
Discover Internet of Things | 2024 / Vol. 4 / Issue 01
Keywords
Artificial Neural Network; Audio Classification; Audio file management; Audio visualization; Automated Systems; Mel Frequency Cepstral Coefficients; Short-Time Fourier Transform;
DOI
10.1007/s43926-023-00049-y
Abstract
In the era of automated and digitalized information, advanced computer applications handle data of which a major part is audio-related. Advancements in technology have ushered in a new era where cutting-edge devices can deliver comprehensive insights into audio content, leveraging sophisticated techniques such as Mel Frequency Cepstral Coefficients (MFCCs) and the Short-Time Fourier Transform (STFT) to extract and provide pertinent information. Our study not only supports efficient audio file management and retrieval but also plays a vital role in security, the robotics industry, and investigations. Beyond its industrial applications, our model exhibits remarkable versatility in the corporate sector, particularly in tasks such as siren sound detection. Embracing this capability holds the promise of catalyzing the development of advanced automated systems, paving the way for increased efficiency and safety across various corporate domains. The primary aim of our experiment is to create highly efficient audio classification models that can be seamlessly automated and deployed within the industrial sector, addressing critical needs for enhanced productivity and performance. Despite the dynamic nature of environmental sounds and the presence of noise, our presented audio classification model proves to be efficient and accurate. The novelty of our research work lies in comparing two different audio datasets with similar characteristics: we classify the audio signals into several categories using various machine learning techniques applied to MFCC and STFT features extracted from the signals. We have also tested the results before and after noise removal to analyze the effect of noise on the results, including precision, recall, specificity, and F1-score.
Our experiment shows that the ANN model outperforms the other six audio models, with accuracies of 91.41% and 91.27% on the respective datasets. © The Author(s) 2023.
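The abstract names STFT features but does not give the extraction settings, so the following is only a minimal NumPy sketch of how a magnitude STFT can be computed. The function name `stft_magnitude`, the frame length of 256 samples, the hop of 128 samples, the Hann window, and the synthetic 1 kHz test tone are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude STFT sketch (hypothetical helper, not the paper's code).

    Splits the signal into overlapping frames, applies a Hann window to
    each frame, and takes the magnitude of the real FFT of every frame.
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    # Number of full frames that fit in the signal (no padding is applied).
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


# Assumed test input: one second of a 1 kHz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
# Average over time and locate the strongest frequency bin.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sr / 256  # bin width is sr / frame_len
```

With these assumed settings each frequency bin spans 31.25 Hz, so the 1 kHz tone falls exactly on bin 32. A classifier would typically consume `spec` (or a log-scaled version of it) as its feature matrix.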