Research on a Deep Learning Method for Speech Recognition

Cited by: 0
Authors
Xiao, Jia [1]
Xiaolin, Sun [1]
Affiliations
[1] Artificial Intelligence and Software Engineering, Nanyang Normal University, Nanyang, 473061, China
Keywords
Audition - Convolution - Deep neural networks - Speech enhancement - Speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification number
Abstract
Deep convolutional neural networks (CNNs) are widely used in speech recognition, and models based on deep CNNs can effectively improve the quality of human-computer interaction. However, existing CNNs with a fixed convolutional kernel size are at a disadvantage when extracting data features: it is hard to determine whether the extracted features are sufficient. To solve this problem, a self-tuning convolutional kernel (STCK) algorithm is proposed. First, the computational process of the STCK algorithm is derived, and the formula for calculating the convolutional kernel size is obtained. Meanwhile, the Bark spectrum is introduced to extract the spectrogram of the speech signal, which is used as the CNN input to match human hearing. In addition, two data enhancement strategies are proposed, namely frame channel shielding and Bark-band channel shielding, which further improve the generalization ability of the recognition model. The experimental results show that, compared with two other models (the CNN model without the STCK algorithm and the CNN model without the data enhancement strategies), the proposed method achieves the lowest training loss, and the recognition error rates on the test samples are reduced by 3.9% and 1%, respectively. © 2024, International Association of Engineers. All Rights Reserved.
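The abstract names frame channel shielding and Bark-band channel shielding but does not define them. The sketch below is only an illustration under a stated assumption: that each strategy zeroes one contiguous run of time frames or Bark bands in the input spectrogram, in the spirit of time and frequency masking. The function name `shield_channels`, its parameters, and the choice of zero as the fill value are hypothetical and are not taken from the paper.

```python
import random


def shield_channels(spec, max_frames=2, max_bands=2, seed=None):
    """Return a masked copy of a spectrogram.

    spec is a list of frames; each frame is a list of Bark-band energies.
    Assumption (not from the paper): shielding means setting values to 0.0.
    """
    rng = random.Random(seed)
    n_frames = len(spec)
    n_bands = len(spec[0])
    out = [frame[:] for frame in spec]  # copy so the input is not modified

    # Frame channel shielding: zero one contiguous run of time frames.
    width = rng.randint(0, min(max_frames, n_frames))
    start = rng.randint(0, n_frames - width)
    for t in range(start, start + width):
        out[t] = [0.0] * n_bands

    # Bark-band channel shielding: zero one contiguous run of bands.
    width_b = rng.randint(0, min(max_bands, n_bands))
    start_b = rng.randint(0, n_bands - width_b)
    for frame in out:
        for b in range(start_b, start_b + width_b):
            frame[b] = 0.0
    return out


# Usage: a toy spectrogram with 10 frames and 24 Bark bands.
toy = [[1.0] * 24 for _ in range(10)]
masked = shield_channels(toy, max_frames=2, max_bands=2, seed=0)
```

Applying a fresh random mask to each training example is one plausible way such shielding could act as data enhancement; the mask widths used by the authors are not stated in the abstract.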
Pages: 1272-1280
Related papers
50 in total
  • [41] Recognition of English speech - using a deep learning algorithm
    Wang, Shuyan
    JOURNAL OF INTELLIGENT SYSTEMS, 2023, 32 (01)
  • [42] DEEP VARIATIONAL FILTER LEARNING MODELS FOR SPEECH RECOGNITION
    Agrawal, Purvi
    Ganapathy, Sriram
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 5731 - 5735
  • [43] Survey of Deep Representation Learning for Speech Emotion Recognition
    Latif, Siddique
    Rana, Rajib
    Khalifa, Sara
    Jurdak, Raja
    Qadir, Junaid
    Schuller, Bjorn
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (02) : 1634 - 1654
  • [44] Deep Learning Analysis Models for Speech and Emotional Recognition
    Wu, Jun
    Zhu, Tianliang
    Yu, Chengtian
    Wang, Chunzhi
    Zhou, Xianjing
    Liu, Hu
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021: 1541 - 1545
  • [45] Deep Learning of Speech Features for Improved Phonetic Recognition
    Lee, Jaehyung
    Lee, Soo-Young
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 1256 - 1259
  • [46] Ensemble deep learning with HuBERT for speech emotion recognition
    Yang, Janghoon
    2023 IEEE 17TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC, 2023: 153 - 154
  • [47] Dysarthric Speech Recognition Based on Deep Metric Learning
    Takashima, Yuki
    Takashima, Ryoichi
    Takiguchi, Tetsuya
    Ariki, Yasuo
    INTERSPEECH 2020, 2020: 4796 - 4800
  • [48] Classical and Deep Learning Methods for Speech Command Recognition
    Xie, Jie
    Li, Qijing
    Hu, Kai
    Zhu, Mingying
    2021 IEEE 9TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND NETWORKS (ICICN 2021), 2021: 41 - 45
  • [49] Evaluating deep learning architectures for Speech Emotion Recognition
    Fayek, Haytham M.
    Lech, Margaret
    Cavedon, Lawrence
    NEURAL NETWORKS, 2017, 92 : 60 - 68
  • [50] On Comparison of Deep Learning Architectures for Distant Speech Recognition
    Sustika, Rika
    Yuliani, Asri R.
    Zaenudin, Efendi
    Pardede, Hilman F.
    2017 2ND INTERNATIONAL CONFERENCES ON INFORMATION TECHNOLOGY, INFORMATION SYSTEMS AND ELECTRICAL ENGINEERING (ICITISEE): OPPORTUNITIES AND CHALLENGES ON BIG DATA FUTURE INNOVATION, 2017: 17 - 21