Privacy Preserving Acoustic Model Training for Speech Recognition

Cited by: 0
Authors
Tachioka, Yuuki [1]
Affiliations
[1] Denso IT Lab, Tokyo, Japan
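Source
2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2020
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract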
In-domain speech data significantly improve the speech recognition performance of acoustic models. However, the data may contain confidential information and exposure of transcriptions may lead to a breach in speakers' privacy. In addition, speaker identification can be problematic when speakers want to hide their membership of a certain group. Thus, the in-domain data must be deleted after its period of use. However, once the data are deleted, models cannot be updated for future architectures. Privacy preservation is necessary when retaining speech data; it is important that the transcriptions cannot be reconstructed and the speaker cannot be identified. This paper proposes a privacy preserving acoustic model training (PPAMT) method that satisfies these requirements and formulates the sensitivities of three features (n-grams, phoneme labels, and acoustic features) for PPAMT. A sensitivity analysis showed that phoneme labels and acoustic features were less susceptible to PPAMT than n-grams, which is optimal because accurate phoneme labels and acoustic features are needed for acoustic model training. Speech recognition experiments showed that the word error rate degradation by PPAMT was less than 0.6% as a result of this property.
Pages: 627-631
Number of pages: 5