Speech Fatigue Recognition Under Small Samples Based on Generative Adversarial Networks and BLSTM

Authors
Chen, Shuxi [1]
Qiu, Jianlin [2]
Zhang, Haifei [1]
Yu, Yifan [1]
Chen, Hao [3]
Sun, Yiyang [1]
Affiliations
[1] Nantong Inst Technol, Sch Comp & Informat Engn, Yongxing Rd 211, Nantong 226002, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Seyuan Rd 9, Nantong 226019, Peoples R China
[3] Royal Inst Technol KTH, Sch Elect Engn & Comp Sci, Brinellvagen 8, S-11428 Stockholm, Sweden
Keywords
GAN; speech fatigue recognition; small samples; data augmentation; BLSTM; transfer learning
DOI
10.1142/S0218001424580059
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To address the low accuracy of speech fatigue recognition (SFR) when only a small number of samples is available, a small-sample SFR method based on generative adversarial networks (GANs) is proposed. First, the generator and discriminator are trained adversarially to learn the features of the samples, and the generator is then used to produce high-quality simulated samples that expand the dataset. Next, the discriminator's parameters are transferred to the fatigue identification network to speed up its training. Finally, a bidirectional long short-term memory (BLSTM) network further learns temporal fatigue features and improves the fatigue recognition rate. For training and testing, 720 speech samples were selected from a self-built Chinese speech database (SUSP-SFD). The results indicate that, compared with traditional SFR methods such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, the proposed method improves the SFR rate by about 2.3-6.7%, which verifies its effectiveness.
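The parameter-transfer step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which layers are shared or how the networks are built, so the layer names (`conv1`, `conv2`, `blstm`, `fatigue_head`, `real_fake_head`), the flat weight lists, and the helper `transfer_parameters` below are all hypothetical and serve only to show the general idea: layers shared with the trained discriminator start from its values, while the remaining layers keep their own initialization.

```python
import copy


def transfer_parameters(discriminator_params, classifier_params, shared_layers):
    """Return a copy of classifier_params in which the named shared layers
    are overwritten with the discriminator's trained values.

    Hypothetical helper: parameters are plain dicts mapping a layer name
    to a flat list of weights, standing in for real tensors.
    """
    merged = copy.deepcopy(classifier_params)
    for name in shared_layers:
        source = discriminator_params[name]
        target = merged[name]
        # Copy only when the layer sizes agree; otherwise the layers are
        # not actually interchangeable.
        if len(source) != len(target):
            raise ValueError(f"size mismatch for layer {name!r}")
        merged[name] = copy.deepcopy(source)
    return merged


# Toy values (assumed, not taken from the paper).
discriminator = {
    "conv1": [0.2, -0.1, 0.4],
    "conv2": [0.3, 0.3],
    "real_fake_head": [0.9],  # GAN-specific output layer, not transferred
}
classifier = {
    "conv1": [0.0, 0.0, 0.0],
    "conv2": [0.0, 0.0],
    "blstm": [0.0, 0.0, 0.0, 0.0],  # temporal layers keep their own init
    "fatigue_head": [0.0, 0.0],
}

initialized = transfer_parameters(discriminator, classifier, ["conv1", "conv2"])
```

In this sketch only the feature-extraction layers are copied; the discriminator's real/fake output layer is left behind because the recognition network predicts fatigue classes instead.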
Pages: 18