Infant cry classification by MFCC feature extraction with MLP and CNN structures

Cited by: 15
Authors
Abbaskhah, Ahmad [1, 4]
Sedighi, Hamed [2, 3, 5]
Marvi, Hossein [4 ]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Sharif, Iran
[2] Beijing Inst Technol, Sch Aerosp & Engn, Beijing, Peoples R China
[3] Shahrood Univ Technol, Fac Mech Engn, Shahrood, Iran
[4] Shahrood Univ Technol, Fac Elect Engn, Shahrood, Iran
[5] Shahrood Univ Technol, Fac Mech Engn, Shahrood 3619995161, Iran
Keywords
Infant cry; Mel-frequency Cepstral Coefficient; Multilayer perceptron; Support vector machine; Convolutional neural network; SMOTE; Classification; IDENTIFICATION;
DOI
10.1016/j.bspc.2023.105261
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
In this study, Dunstan's infant cry data set is pre-processed into feature vectors comprising 19 MFCC features and one energy feature. Using the extracted features, Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Convolutional Neural Network (CNN) classifiers distinguish five classes of infant cry ("Neh" = hungry; "Eh" = need to burp; "Owh" = tired; "Eairh" = stomach cramp; "Heh" = physical discomfort). The proposed MLP and CNN structures are analyzed in terms of loss and accuracy per epoch; classifier performance is evaluated with AUC-ROC, the confusion matrix, accuracy, F1 score, recall, and precision. Of the three classifiers, the designed CNN model performs best, and the results indicate that performance improves as model complexity increases. Each classifier is run 10 times; the average accuracy of the SVM on SMOTE and non-SMOTE data is 0.823 ± 0.02 and 0.861 ± 0.02, respectively. The corresponding accuracies are 0.876 ± 0.01 and 0.892 ± 0.01 for the MLP, and 0.921 ± 0.005 and 0.911 ± 0.005 for the CNN. In the best case, the proposed CNN structure reaches an accuracy of 92.1% on the five classes of infant cries.
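As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch, not the authors' code. It builds a confusion matrix for the five cry classes, derives accuracy, per-class precision, recall, and F1, and summarizes accuracy over repeated runs. The function names, the toy labels, and the reading of the "±" figures as a sample standard deviation are all assumptions made for illustration; the abstract does not specify how the tolerance was computed.

```python
from statistics import mean, stdev

# The five cry classes named in the abstract.
CLASSES = ["Neh", "Eh", "Owh", "Eairh", "Heh"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_scores(matrix):
    """Return one (precision, recall, f1) tuple per class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[r][k] for r in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def summarize_runs(accuracies):
    """Mean and sample standard deviation over repeated runs
    (assumed interpretation of the abstract's "±" values)."""
    return mean(accuracies), stdev(accuracies)


# Toy labels invented purely for illustration (NOT data from the study).
y_true = ["Neh", "Neh", "Eh", "Owh", "Owh", "Eairh", "Heh", "Heh"]
y_pred = ["Neh", "Eh", "Eh", "Owh", "Owh", "Eairh", "Heh", "Neh"]

cm = confusion_matrix(y_true, y_pred, CLASSES)
acc = accuracy(cm)
macro_f1 = mean(f1 for _, _, f1 in per_class_scores(cm))
```

In a setup like the one described, `summarize_runs` would be given the 10 accuracy values from the repeated runs of one classifier, once for SMOTE-balanced data and once for the original data.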
Pages: 11
Related papers
50 in total
  • [1] Classification and Recognition of Baby Cry Signal Feature Extraction Based on Improved MFCC
    Chen, Zhenjiang
    Peng, Yizhun
    Li, Di
    Yang, Zhou
    Wang, Nana
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 556 - 559
  • [2] Enhancing Audio Classification Through MFCC Feature Extraction and Data Augmentation with CNN and RNN Models
    Rezaul, Karim Mohammed
    Jewel, Md
    Islam, Md Shabiul
    Siddiquee, Kazy Noor E. Alam
    Barua, Nick
    Rahman, Muhammad Azizur
    Shan-A-Khuda, Mohammad
    Bin Sulaiman, Rejwan
    Shaikh, Md Sadeque Imam
    Hamim, Md Abrar
    Tanmoy, F. M.
    Ul Haque, Afraz
    Nipun, Musarrat Saberin
    Dorudian, Navid
    Kareem, Amer
    Farid, Ahmed Khondokar
    Mubarak, Asma
    Jannat, Tajnuva
    Asha, Umme Fatema Tuj
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 37 - 53
  • [3] Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification
    Pusuluri, Aditya
    Kachhi, Aastha
    Patil, Hemant A.
    SPEECH AND COMPUTER, SPECOM 2022, 2022, 13721 : 590 - 603
  • [4] Feature Set Optimisation for Infant Cry Classification
    Vignolo, Leandro D.
    Marcelo Albornoz, Enrique
    Ernesto Martinez, Cesar
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2018, 2018, 11238 : 455 - 466
  • [5] DWT/MFCC Feature Extraction for Tile Tapping Sound Classification
    Panyavaraporn, Jantana
    Limsupreeyarat, Petcharat
    Horkaew, Paramate
    INTERNATIONAL JOURNAL OF INTEGRATED ENGINEERING, 2020, 12 (03) : 122 - 130
  • [6] Classification and Recognition of Underwater Target Based on MFCC Feature Extraction
    Tong, Yuze
    Zhang, Xin
    Ge, Yizhou
    2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020), 2020,
  • [7] A CNN Model for Gas Pipeline Leakage Detection Based on MFCC Feature Extraction
    Sun, Chen
    Wan, Yujie
    Zhu, Peizhi
    Lin, Fanqiang
    PROCEEDINGS OF 2023 THE 12TH INTERNATIONAL CONFERENCE ON NETWORKS, COMMUNICATION AND COMPUTING, ICNCC 2023, 2023, : 288 - 293
  • [8] Arabic Speech Recognition Using MFCC Feature Extraction and ANN Classification
    Wahyuni, Elvira Sukma
    2017 2ND INTERNATIONAL CONFERENCES ON INFORMATION TECHNOLOGY, INFORMATION SYSTEMS AND ELECTRICAL ENGINEERING (ICITISEE): OPPORTUNITIES AND CHALLENGES ON BIG DATA FUTURE INNOVATION, 2017, : 22 - 25
  • [9] Performance Comparison between Mutative and Constriction PSO in Optimizing MFCC for the Classification of Hypothyroid Infant Cry
    Zabidi, A.
    Mansor, W.
    Lee, Y. K.
    Yassin, I. M.
    Sahak, R.
    5TH KUALA LUMPUR INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING 2011 (BIOMED 2011), 2011, 35 : 542 - +
  • [10] Speaker Identification Using MFCC Feature Extraction ANN Classification Technique
    Singh, Mahesh K.
    WIRELESS PERSONAL COMMUNICATIONS, 2024, 136 (01) : 453 - 467