Infant cry classification by MFCC feature extraction with MLP and CNN structures

Cited by: 15
Authors
Abbaskhah, Ahmad [1, 4]
Sedighi, Hamed [2, 3, 5]
Marvi, Hossein [4 ]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Sharif, Iran
[2] Beijing Inst Technol, Sch Aerosp & Engn, Beijing, Peoples R China
[3] Shahrood Univ Technol, Fac Mech Engn, Shahrood, Iran
[4] Shahrood Univ Technol, Fac Elect Engn, Shahrood, Iran
[5] Shahrood Univ Technol, Fac Mech Engn, Shahrood 3619995161, Iran
Keywords
Infant cry; Mel-frequency Cepstral Coefficient; Multilayer perceptron; Support vector machine; Convolutional neural network; SMOTE; Classification; IDENTIFICATION
DOI
10.1016/j.bspc.2023.105261
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
In this study, Dunstan's infant cry data set is pre-processed into feature vectors comprising 19 MFCC features and one energy feature. Using the extracted features, Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Convolutional Neural Network (CNN) classifiers distinguish five classes of infant cry ("Neh" = hungry; "Eh" = needs to burp; "Owh" = tired; "Eairh" = stomach cramp; "Heh" = physical discomfort). The proposed MLP and CNN structures are analyzed in terms of loss and accuracy per epoch; moreover, classifier performance is evaluated with AUC-ROC, the confusion matrix, accuracy, F1 score, recall, and precision. Of the three classifiers, the designed CNN model performs best, and the results indicate that performance improves as model complexity increases. Each classifier is run 10 times; the average accuracy for SVM on SMOTE and non-SMOTE data is 0.823 ± 0.02 and 0.861 ± 0.02, respectively. The corresponding accuracies are 0.876 ± 0.01 and 0.892 ± 0.01 for MLP, and 0.921 ± 0.005 and 0.911 ± 0.005 for CNN. In the best case, the proposed CNN structure reaches an accuracy of 92.1% across the five classes of infant cries.
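The abstract compares SMOTE and non-SMOTE training data but does not describe how the oversampling was configured. As a rough illustration only, the sketch below shows the core SMOTE idea: a synthetic minority sample is placed on the line segment between a real minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the neighbour count `k`, the seed, and the toy 20-dimensional vectors (standing in for 19 MFCCs plus one energy value) are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper illustrating the SMOTE idea; not the authors' code.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(minority) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumption): six 20-dimensional vectors for one minority class.
data_rng = random.Random(1)
minority = [[data_rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(6)]
new_samples = smote_sketch(minority, n_new=4, k=3)
```

Because each synthetic vector is a convex combination of two real ones, it always lies within the range spanned by the minority class in every feature dimension. In practice a maintained implementation would normally be preferred over a hand-written one, and oversampling would be applied to the training split only.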
Pages: 11