Classical and Deep Learning Methods for Speech Command Recognition

Cited by: 2
Authors
Xie, Jie [1]
Li, Qijing [1]
Hu, Kai [1]
Zhu, Mingying [2]
Affiliations
[1] Jiangnan Univ, Sch Internet Things Engn, Wuxi, Jiangsu, Peoples R China
[2] Nanjing Univ, Sch Econ, Wuxi, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
speech command recognition; convolutional neural networks; acoustic feature;
DOI
10.1109/ICICN52636.2021.9673813
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As an application area of speech command recognition, the smart home provides people with a convenient way to communicate with various digital devices. In this study, we investigate both classical machine learning and deep learning architectures for improved speaker-independent speech command recognition. First, we extract statistical MFCC vectors to train classical machine learning models: KNN, SVM, and RF. Second, we train deep learning models using two end-to-end architectures with different inputs. Experimental results indicate that our presented method achieved the highest accuracy and F1 score of 0.846 +/- 0.148 and 0.84 +/- 0.157, respectively, on the private dataset.
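The first step in the abstract (statistical MFCC vectors fed to a classical classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "statistical" means the per-coefficient mean and standard deviation taken over frames, which the abstract does not specify. The names `statistical_vector` and `knn_predict` and the toy data are hypothetical, and the MFCC extraction itself is taken as already done.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def statistical_vector(frames):
    """Collapse an (n_frames x n_coeffs) list of MFCC frames into one
    fixed-length vector: per-coefficient means, then standard deviations.
    (Assumed pooling scheme; the abstract does not state the statistics.)"""
    columns = list(zip(*frames))  # one tuple per coefficient
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Plain k-nearest-neighbour majority vote with Euclidean distance."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for MFCC matrices (2 coefficients per frame). Real input
# would come from an MFCC extractor, which is outside this sketch.
utterances = {
    "on": [[[1.0, 2.0], [1.2, 2.1], [0.9, 1.9]],
           [[1.1, 2.2], [1.0, 2.0], [1.3, 2.1]]],
    "off": [[[5.0, -1.0], [5.2, -1.1], [4.9, -0.8]],
            [[5.1, -0.9], [5.0, -1.2], [5.3, -1.0]]],
}

train_vectors, train_labels = [], []
for label, clips in utterances.items():
    for clip in clips:
        train_vectors.append(statistical_vector(clip))
        train_labels.append(label)

query_clip = [[5.1, -1.0], [4.8, -0.9], [5.0, -1.1]]
prediction = knn_predict(train_vectors, train_labels,
                         statistical_vector(query_clip))
print(prediction)
```

Pooling over frames gives every utterance a vector of the same length regardless of its duration, which is what lets fixed-input classifiers such as KNN, SVM, or RF operate on variable-length audio.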
Pages: 41-45
Page count: 5