Helicopter cockpit speech recognition method based on transfer learning and context biasing

Authors
Wang, Guotao [1 ,2 ]
Wang, Jiaqi [1 ]
Wang, Shicheng [3 ]
Wu, Qianyu [1 ]
Teng, Yuru [1 ]
Affiliations
[1] Heilongjiang Univ, Sch Elect Engn, Harbin 150080, Peoples R China
[2] Harbin Inst Technol, Inst Reliabil Elect Apparat & Elect, Harbin 150001, Peoples R China
[3] Army Aviat Acad, Beijing 101121, Peoples R China
Source
ENGINEERING RESEARCH EXPRESS | 2024 / Vol. 6 / Issue 03
Funding
National Natural Science Foundation of China;
Keywords
speech recognition; noise reduction; transfer learning; language model; context biasing; TRANSFORMER; NOISE;
DOI
10.1088/2631-8695/ad6bec
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Currently, Chinese speech recognition technology is generally designed for common domains, primarily focusing on accurate recognition of standard Mandarin Chinese in low-noise environments. However, helicopter cockpit speech presents unique challenges, characterized by high-noise environments, specific industry jargon, low contextual relevance, and a lack of publicly available datasets. To address these issues, this paper proposes a helicopter cockpit speech recognition method based on transfer learning and context biasing. By fine-tuning a general speech recognition model, we aim to better adapt it to the characteristics of speech in helicopter cockpits. This study explores noise reduction processing, context biasing, and speed perturbation in helicopter cockpit speech data. Combining pre-trained models with language models, we conduct transfer training to develop a specialized model for helicopter cockpit speech recognition. Finally, the effectiveness of this method is validated using a real dataset. Experimental results show that, on the helicopter speech dataset, this method reduces the word error rate from 72.69% to 12.58%. Furthermore, this approach provides an effective solution for small-sample speech recognition, enhancing model performance on limited datasets.
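The word error rate quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definition (token-level edit distance divided by reference length); the abstract does not state how the authors tokenize or score their transcripts, and for Chinese the tokens may well be characters. The function names `edit_distance`, `word_error_rate`, and `relative_reduction` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Assumed standard WER: edit distance / number of reference tokens."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative error reduction between two error rates."""
    return (before - after) / before


if __name__ == "__main__":
    # One substitution in a three-token reference.
    print(word_error_rate("a b c".split(), "a x c".split()))
    # Relative reduction implied by the figures quoted in the abstract.
    print(round(relative_reduction(72.69, 12.58), 4))
```

Under this definition, the reported drop from 72.69% to 12.58% corresponds to a relative reduction of roughly 83%.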
Pages: 16
Related papers
50 in total
  • [1] Speech Emotion Recognition Based on Sparse Transfer Learning Method
    Song, Peng
    Zheng, Wenming
    Liang, Ruiyu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (07) : 1409 - 1412
  • [2] UNSUPERVISED CONTEXT LEARNING FOR SPEECH RECOGNITION
    Michaely, Assaf Hurwitz
    Ghodsi, Mohammadreza
    Wu, Zelin
    Scheiner, Justin
    Aleksic, Petar
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 447 - 453
  • [3] Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition
    Xu, Tianyi
    Yang, Zhanheng
    Huang, Kaixun
    Guo, Pengcheng
    Zhang, Ao
    Li, Biao
    Chen, Changru
    Li, Chao
    Xie, Lei
    INTERSPEECH 2023, 2023, : 1668 - 1672
  • [4] Transfer Learning for Speech Emotion Recognition
    Han, Zhijie
    Zhao, Huijuan
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 96 - 99
  • [5] Feature Selection Based Transfer Subspace Learning for Speech Emotion Recognition
    Song, Peng
    Zheng, Wenming
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2020, 11 (03) : 373 - 382
  • [6] Research on transfer learning for Khalkha Mongolian speech recognition based on TDNN
    Shi, Linyan
    Bao, Feilong
    Wang, Yonghe
    Gao, Guanglai
    2018 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2018, : 85 - 89
  • [7] Speech emotion recognition based on transfer learning from the FaceNet framework
    Liu, Shuhua
    Zhang, Mengyu
    Fang, Ming
    Zhao, Jianwei
    Hou, Kun
    Hung, Chih-Cheng
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (02) : 1338 - 1345
  • [8] Transfer Learning Based Method for Human Activity Recognition
    Zebhi, Saeedeh
    AlModarresi, S. M. T.
    Abootalebi, Vahid
    2021 29TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2021, : 761 - 765
  • [9] Control Chart Recognition Method Based on Transfer Learning
    Xu, Xu-Dong
    Ma, Li-Qian
    2018 4TH ANNUAL INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC 2018), 2018, : 446 - 451
  • [10] Transfer learning for children's speech recognition
    Tong, Rong
    Wang, Lei
    Ma, Bin
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 36 - 39