Privacy Preserving Acoustic Model Training for Speech Recognition

Citations: 0

Authors:

Tachioka, Yuuki [1]

Affiliations:

[1] Denso IT Lab, Tokyo, Japan

Source:

2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2020

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In-domain speech data significantly improve the speech recognition performance of acoustic models. However, the data may contain confidential information and exposure of transcriptions may lead to a breach in speakers' privacy. In addition, speaker identification can be problematic when speakers want to hide their membership of a certain group. Thus, the in-domain data must be deleted after its period of use. However, once the data are deleted, models cannot be updated for future architectures. Privacy preservation is necessary when retaining speech data; it is important that the transcriptions cannot be reconstructed and the speaker cannot be identified. This paper proposes a privacy preserving acoustic model training (PPAMT) method that satisfies these requirements and formulates the sensitivities of three features (n-grams, phoneme labels, and acoustic features) for PPAMT. A sensitivity analysis showed that phoneme labels and acoustic features were less susceptible to PPAMT than n-grams, which is optimal because accurate phoneme labels and acoustic features are needed for acoustic model training. Speech recognition experiments showed that the word error rate degradation by PPAMT was less than 0.6% as a result of this property.

Pages: 627-631

Number of pages: 5

