Training Augmentation with Adversarial Examples for Robust Speech Recognition

Cited by: 18
Authors
Sun, Sining [1]
Yeh, Ching-Feng [2]
Ostendorf, Mari [3]
Hwang, Mei-Yuh [2]
Xie, Lei [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China
[2] Mobvoi AI Lab, Seattle, WA USA
[3] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
Funding
National Natural Science Foundation of China;
Keywords
robust speech recognition; adversarial examples; FGSM; data augmentation; teacher-student model; ADAPTATION;
DOI
10.21437/Interspeech.2018-1247
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models. During training, the fast gradient sign method is used to generate adversarial examples augmenting the original training data. Different from conventional data augmentation based on data transformations, the examples are dynamically generated based on current acoustic model parameters. We assess the impact of adversarial data augmentation in experiments on the Aurora-4 and CHiME-4 single-channel tasks, showing improved robustness against noise and channel variation. Further improvement is obtained when combining adversarial examples with teacher/student training, leading to a 23% relative word error rate reduction on Aurora-4.
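The augmentation loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a toy softmax classifier in NumPy as a stand-in for the deep neural network acoustic model, and the function names, the perturbation size `epsilon`, the learning rate, and the order of the two updates per iteration are all hypothetical choices. The only part taken from the abstract is the idea itself: adversarial examples are regenerated with the fast gradient sign method (FGSM) from the current model parameters at every step, and are used alongside the original data.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, b, x, y):
    # Mean cross-entropy of a linear softmax classifier (toy stand-in model).
    p = softmax(x @ W + b)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def fgsm_examples(W, b, x, y, epsilon):
    # FGSM: x_adv = x + epsilon * sign(d loss / d x), using the CURRENT W, b.
    p = softmax(x @ W + b)
    p[np.arange(len(y)), y] -= 1.0      # gradient of the loss w.r.t. the logits
    grad_x = p @ W.T                    # chain rule back to the input features
    return x + epsilon * np.sign(grad_x)

def sgd_step(W, b, x, y, lr):
    # One gradient-descent update on the mean cross-entropy.
    p = softmax(x @ W + b)
    p[np.arange(len(y)), y] -= 1.0
    p /= len(y)
    return W - lr * (x.T @ p), b - lr * p.sum(axis=0)

def train_with_adversarial_augmentation(x, y, num_classes,
                                        epsilon=0.1, lr=0.5, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.standard_normal((x.shape[1], num_classes))
    b = np.zeros(num_classes)
    for _ in range(epochs):
        # Adversarial examples are regenerated every iteration, so they
        # track the model as it changes (unlike a fixed, offline transform).
        x_adv = fgsm_examples(W, b, x, y, epsilon)
        W, b = sgd_step(W, b, x, y, lr)       # update on the original data
        W, b = sgd_step(W, b, x_adv, y, lr)   # update on the adversarial data
    return W, b
```

In this sketch each perturbed feature differs from the original by at most `epsilon`, so the adversarial data stays close to the original data while being chosen to increase the loss of the current model.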
Pages: 2404-2408
Number of pages: 5
Related papers
50 in total
  • [1] Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
    Qin, Yao
    Carlini, Nicholas
    Goodfellow, Ian
    Cottrell, Garrison
    Raffel, Colin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [2] Data augmentation using generative adversarial networks for robust speech recognition
    Qian, Yanmin
    Hu, Hu
    Tan, Tian
    SPEECH COMMUNICATION, 2019, 114 : 1 - 9
  • [3] AudioGuard: Speech Recognition System Robust against Optimized Audio Adversarial Examples
    Kwon, Hyun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (20) : 57943 - 57962
  • [4] GENERATIVE ADVERSARIAL NETWORKS BASED DATA AUGMENTATION FOR NOISE ROBUST SPEECH RECOGNITION
    Hu, Hu
    Tan, Tian
    Qian, Yanmin
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5044 - 5048
  • [5] Data Augmentation using Conditional Generative Adversarial Networks for Robust Speech Recognition
    Sheng, Peiyao
    Yang, Zhuolin
    Hu, Hu
    Tan, Tian
    Qian, Yanmin
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 121 - 125
  • [6] Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors
    Wang, Longshaokan
    Fazel-zarandi, Maryam
    Tiwari, Aditya
    Matsoukas, Spyros
    Polymenakos, Lazaros
    NLP FOR CONVERSATIONAL AI, 2020, : 63 - 70
  • [7] Adversarial Data Augmentation for Disordered Speech Recognition
    Jin, Zengrui
    Geng, Mengzhe
    Xie, Xurong
    Yu, Jianwei
    Liu, Shansong
    Liu, Xunying
    Meng, Helen
    INTERSPEECH 2021, 2021, : 4803 - 4807
  • [8] IMPERIO: Robust Over-the-Air Adversarial Examples for Automatic Speech Recognition Systems
    Schoenherr, Lea
    Eisenhofer, Thorsten
    Zeiler, Steffen
    Holz, Thorsten
    Kolossa, Dorothea
    36TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2020), 2020, : 843 - 855
  • [9] Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition
    Liu, Bin
    Nie, Shuai
    Liang, Shan
    Liu, Wenju
    Yu, Meng
    Chen, Lianwu
    Peng, Shouye
    Li, Changliang
    INTERSPEECH 2019, 2019, : 491 - 495
  • [10] Robust Automatic Speech Recognition via WavAugment Guided Phoneme Adversarial Training
    Qi, Gege
    Chen, Yuefeng
    Mao, Xiaofeng
    Jia, Xiaojun
    Duan, Ranjie
    Zhang, Rong
    Xue, Hui
    INTERSPEECH 2023, 2023, : 561 - 565