Category-based and Target-based Data Augmentation for Dysarthric Speech Recognition Using Transfer Learning

Cited by: 0
Authors
Nawroly, Sarkhell Sirwan [1 ]
Popescu, Decebal [1 ]
Antony, Mariya Celin Thekekara [2]
Affiliations
[1] Natl Univ Sci & Technol POLITEHN Bucharest, Fac Automat Control & Comp Sci, 313 Splaiul Independentei, Bucharest 060042, Romania
[2] Sai Univ, Sch Comp & Data Sci, Paiyanur 603104, Tamil Nadu, India
Source
STUDIES IN INFORMATICS AND CONTROL | 2024, Vol. 33, No. 04
Keywords
Dysarthric speech recognition; Noise analysis; Transfer learning approach; NOISE;
DOI
10.24846/v33i4y202408
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Dysarthric speech recognition poses unique challenges compared with normal speech recognition systems due to the scarcity of dysarthric speech data. To address this data sparsity issue, researchers have developed data augmentation techniques that use either the original dysarthric speech examples or speech data from normal speakers to generate new dysarthric speech data, thereby improving dysarthric speech recognition performance. This study uses dysarthric speech examples to create augmented training examples so that the identity of the dysarthric speakers, in terms of their speech errors, is retained. A two-stage transfer learning strategy is employed: the first stage introduces a category-specific low-frequency noise augmentation method, and the second stage implements a dysarthric speaker-specific data augmentation approach. The proposed method combines the advantages of various data augmentation approaches from the literature into a refined two-stage model that handles data augmentation without compromising the quality of the target model. This two-stage approach achieved a notable Word Error Rate (WER) reduction of approximately 11.369%, particularly for severely affected dysarthric speakers, compared with a transfer learning method that relies only on normal speech data for training.
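The abstract does not specify implementation details for the category-specific low-frequency noise augmentation. As a minimal sketch only, the code below shows one way such a step could be realized, assuming low-pass-filtered additive noise scaled by a severity-dependent signal-to-noise ratio; the category names, cutoff frequency, and SNR values are illustrative assumptions, not the authors' settings.

import numpy as np
from scipy.signal import butter, lfilter

# Hypothetical severity categories mapped to target SNRs (dB); the paper's
# actual categories and noise levels are not reproduced here.
CATEGORY_SNR_DB = {"mild": 25.0, "moderate": 20.0, "severe": 15.0}

def low_frequency_noise(num_samples, sample_rate, cutoff_hz=300.0):
    # White noise passed through a 4th-order Butterworth low-pass filter,
    # keeping only low-frequency components (the cutoff is an assumption).
    noise = np.random.randn(num_samples)
    b, a = butter(4, cutoff_hz / (sample_rate / 2), btype="low")
    return lfilter(b, a, noise)

def augment_utterance(speech, sample_rate, category):
    # Mix category-scaled low-frequency noise into a dysarthric utterance.
    noise = low_frequency_noise(len(speech), sample_rate)
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    snr_db = CATEGORY_SNR_DB[category]
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

Under this reading, the augmented utterances would presumably be used alongside the original dysarthric recordings when fine-tuning the pretrained acoustic model in the first transfer learning stage.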
Pages: 130
Related Papers
50 records in total
  • [21] Reinforcement Learning based Data Augmentation for Noise Robust Speech Emotion Recognition
    Ranjan, Sumit
    Chakraborty, Rupayan
    Kopparapu, Sunil Kumar
    INTERSPEECH 2024, 2024, : 1040 - 1044
  • [22] SENet-based speech emotion recognition using synthesis-style transfer data augmentation
    Rajan, R.
    Hridya Raj, T. V.
    International Journal of Speech Technology, 2023, 26 (04) : 1017 - 1030
  • [23] Learning about social category-based obligations
    Chalik, Lisa
    Rhodes, Marjorie
    COGNITIVE DEVELOPMENT, 2018, 48 : 117 - 124
  • [24] Improving Diacritical Arabic Speech Recognition: Transformer-Based Models with Transfer Learning and Hybrid Data Augmentation
    Alaqel, Haifa
    El Hindi, Khalil
    Information (Switzerland), 2025, 16 (03)
  • [25] Improving Recognition of Dysarthric Speech Using Severity Based Tempo Adaptation
    Bhat, Chitralekha
    Vachhani, Bhavik
    Kopparapu, Sunil
    Speech and Computer, 2016, 9811 : 370 - 377
  • [26] Deep Learning-Based Acoustic Feature Representations for Dysarthric Speech Recognition
    Latha, M.
    Shivakumar, M.
    Manjula, G.
    Hemakumar, M.
    Kumar, M. K.
    SN Computer Science, 4 (3)
  • [27] Android Malware Detection Using Category-Based Machine Learning Classifiers
    Alatwi, Huda Ali
    Oh, Tae
    Fokoue, Ernest
    Stackpole, Bill
    SIGITE'16: PROCEEDINGS OF THE 17TH ANNUAL CONFERENCE ON INFORMATION TECHNOLOGY EDUCATION, 2016, : 54 - 59
  • [28] Data augmentation method for underwater acoustic target recognition based on underwater acoustic channel modeling and transfer learning
    Li, Daihui
    Liu, Feng
    Shen, Tongsheng
    Chen, Liang
    Zhao, Dexin
    APPLIED ACOUSTICS, 2023, 208
  • [29] CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition
    Bao, Fang
    Neumann, Michael
    Ngoc Thang Vu
    INTERSPEECH 2019, 2019, : 2828 - 2832
  • [30] Enhanced Speech Emotion Recognition Using DCGAN-Based Data Augmentation
    Baek, Ji-Young
    Lee, Seok-Pil
    Tsihrintzis, George A.
    ELECTRONICS, 2023, 12 (18)