English speech recognition based on deep learning with multiple features

Cited by: 2
Author
Zhaojuan Song
Affiliation
[1] School of Translation Studies of Qufu Normal University,
Source
Computing | 2020 / Vol. 102
Keywords
Deep neural network; Fusion; Speech recognition; Multiple features; 68T10; 68T35; 68T50
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
English is one of the most widely used languages. As the world grows more connected, smart home devices, in-vehicle voice systems, and voice recognition software that use English as the recognition language have gradually come into public view and have won over most users with their practical accuracy. Deep learning, with its hierarchical feature learning and data modeling capabilities, has outperformed shallow learning techniques on many tasks. This paper therefore takes English speech as the research object and proposes a deep learning speech recognition algorithm that combines speech features and speech attributes. First, a deep neural network trained with supervised learning extracts high-level features of the speech; the output of a fixed hidden layer of the resulting network is taken as the new speech feature, and a GMM–HMM acoustic model is trained on these new features. Second, a speech attribute extractor based on a deep neural network is trained for multiple speech attributes, and the extracted attributes are classified into phonemes by a deep neural network. Finally, the speech features and the speech attribute features are merged into the same CNN framework by a neural network based on a linear feature fusion algorithm. The experimental results show that the proposed multi-feature algorithm can combine the two methods directly and effectively by fusing the speech features and the speech attributes of the speaker at the input layer of the deep neural network, and that it significantly improves the performance of the English speech recognition system.
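The abstract does not spell out the linear feature fusion step. As a rough illustration only, the sketch below assumes that fusion means scaling each per-frame feature stream by a weight and concatenating the two streams before they reach the network input layer. The function names, the weight `alpha`, the L2 normalization, and the sample values are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch: the fusion rule here (weighted concatenation) is an
# assumption, not the method defined in the paper.

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_frame(speech_feat, attr_feat, alpha=0.5):
    """Weight the two normalized streams linearly and concatenate them."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    speech = [alpha * x for x in l2_normalize(speech_feat)]
    attrs = [(1.0 - alpha) * x for x in l2_normalize(attr_feat)]
    return speech + attrs


def fuse_utterance(speech_frames, attr_frames, alpha=0.5):
    """Fuse two frame-aligned streams into one input sequence."""
    if len(speech_frames) != len(attr_frames):
        raise ValueError("both streams need the same number of frames")
    return [fuse_frame(s, a, alpha) for s, a in zip(speech_frames, attr_frames)]


# Hypothetical toy data: 2 frames, 2-dim speech features, 3-dim attribute scores.
speech_frames = [[3.0, 4.0], [0.0, 2.0]]
attr_frames = [[1.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
fused = fuse_utterance(speech_frames, attr_frames, alpha=0.5)
```

Each fused frame has the combined dimensionality of both streams (here 2 + 3 = 5), so a downstream network would need an input layer of that size. Normalizing before weighting keeps one stream from dominating simply because of its scale; whether the paper does this is not stated.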
Pages: 663-682
Page count: 19
Related papers
50 in total
  • [21] Pattern recognition and features selection for speech emotion recognition model using deep learning
    Kittisak Jermsittiparsert
    Abdurrahman Abdurrahman
    Parinya Siriattakul
    Ludmila A. Sundeeva
    Wahidah Hashim
    Robbi Rahim
    Andino Maseleno
    International Journal of Speech Technology, 2020, 23 : 799 - 806
  • [22] Pattern recognition and features selection for speech emotion recognition model using deep learning
    Jermsittiparsert, Kittisak
    Abdurrahman, Abdurrahman
    Siriattakul, Parinya
    Sundeeva, Ludmila A.
    Hashim, Wahidah
    Rahim, Robbi
    Maseleno, Andino
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (04) : 799 - 806
  • [23] SPEECH RECOGNITION FEATURES BASED ON DEEP LATENT GAUSSIAN MODELS
    Tjandra, Andros
    Sakti, Sakriani
    Nakamura, Satoshi
    2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
  • [24] Deep-Sparse-Representation-Based Features for Speech Recognition
    Sharma, Pulkit
    Abrol, Vinayak
    Sao, Anil Kumar
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2162 - 2175
  • [25] Research on English language learning algorithm based on speech recognition
    Liu, Jinping, 1600, TeknoScienze, Viale Brianza,22, Milano, 20127, Italy (28):
  • [26] Research on English Language Learning Algorithm Based on Speech Recognition
    Liu, Jinping
    AGRO FOOD INDUSTRY HI-TECH, 2017, 28 (03): : 2653 - 2656
  • [27] Learning Deep Features on Multiple Scales for Coffee Crop Recognition
    Baeta, Rafael
    Nogueira, Keiller
    Menotti, David
    dos Santos, Jefersson A.
    2017 30TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2017, : 262 - 268
  • [28] Multilayer deep features with multiple kernel learning for action recognition
    Sheng, Biyun
    Li, Jun
    Xiao, Fu
    Yang, Wankou
    NEUROCOMPUTING, 2020, 399 : 65 - 74
  • [29] Automated English Speech Recognition Using Dimensionality Reduction with Deep Learning Approach
    Yu, Jing
    Ye, Nianhua
    Du, Xueqin
    Han, Lu
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [30] Towards Slovak-English-Mandarin Speech Recognition Using Deep Learning
    Pleva, Matus
    Liao, Yuan-Fu
    Hsu, Wuhua
    Hladek, Daniel
    Stas, Jan
    Viszlay, Peter
    Lojka, Martin
    Juhar, Jozef
    PROCEEDINGS OF ELMAR-2018: 60TH INTERNATIONAL SYMPOSIUM ELMAR-2018, 2018, : 151 - 154