Privacy Preserving Acoustic Model Training for Speech Recognition

Cited by: 0
Author
Tachioka, Yuuki [1 ]
Affiliation
[1] Denso IT Lab, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation and computer technology];
Discipline Code
0812
Abstract
In-domain speech data significantly improve the speech recognition performance of acoustic models. However, the data may contain confidential information, and exposure of transcriptions may breach speakers' privacy. In addition, speaker identification can be problematic when speakers want to hide their membership of a certain group. Thus, the in-domain data must be deleted after their period of use. However, once the data are deleted, models cannot be updated for future architectures. Privacy preservation is necessary when retaining speech data; it is important that the transcriptions cannot be reconstructed and the speaker cannot be identified. This paper proposes a privacy preserving acoustic model training (PPAMT) method that satisfies these requirements and formulates the sensitivities of three features (n-grams, phoneme labels, and acoustic features) for PPAMT. A sensitivity analysis showed that phoneme labels and acoustic features were less susceptible to PPAMT than n-grams, which is desirable because accurate phoneme labels and acoustic features are needed for acoustic model training. Speech recognition experiments showed that, as a result of this property, the word error rate degradation caused by PPAMT was less than 0.6%.
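The abstract does not specify the perturbation mechanism, but its core argument — features with lower sensitivity tolerate privacy protection with less distortion — follows the standard differential-privacy trade-off. A minimal illustrative sketch of that trade-off using the Laplace mechanism (not the paper's actual PPAMT method; the function name and parameter values below are assumptions for illustration):

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Perturb a feature vector with Laplace noise whose scale is
    sensitivity / epsilon, the standard epsilon-DP Laplace mechanism.
    Lower sensitivity => less noise at the same privacy budget."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=np.shape(values))

# Hypothetical sensitivities: a low-sensitivity feature (e.g. acoustic
# features, per the abstract's finding) is distorted far less than a
# high-sensitivity one (e.g. n-grams) at the same epsilon.
features = np.zeros(10)
low_noise = laplace_mechanism(features, sensitivity=0.1, epsilon=1.0)
high_noise = laplace_mechanism(features, sensitivity=10.0, epsilon=1.0)
```

This mirrors the abstract's conclusion: because phoneme labels and acoustic features have low sensitivity, the noise needed to protect them barely degrades acoustic model training, while the privacy cost is concentrated in the n-grams.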
Pages: 627 - 631
Page count: 5
Related Papers
50 records in total
  • [1] Using Privacy-Transformed Speech in the Automatic Speech Recognition Acoustic Model Training
    Salimbajevs, Askars
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 47 - 54
  • [2] Acoustic model training for speech recognition over mobile networks
    Vojtko, Juraj
    Kacur, Juraj
    Rozinaj, Gregor
    Korosi, Jan
    INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2013, 6 (02) : 65 - 74
  • [3] Privacy-Preserving Speaker Verification and Speech Recognition
    Abbasi, Wisam
    EMERGING TECHNOLOGIES FOR AUTHORIZATION AND AUTHENTICATION, ETAA 2022, 2023, 13782 : 102 - 119
  • [4] Configurable Privacy-Preserving Automatic Speech Recognition
    Aloufi, Ranya
    Haddadi, Hamed
    Boyle, David
    INTERSPEECH 2021, 2021, : 861 - 865
  • [5] Joint Training of Speech Separation, Filterbank and Acoustic Model for Robust Automatic Speech Recognition
    Wang, Zhong-Qiu
    Wang, DeLiang
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2839 - 2843
  • [6] Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition
    Feng, Tiantian
    Narayanan, Shrikanth
    2021 9TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2021,
  • [7] Privacy-Preserving Personal Model Training
    Servia-Rodriguez, Sandra
    Wang, Liang
    Zhao, Jianxin R.
    Mortier, Richard
    Haddadi, Hamed
    2018 IEEE/ACM THIRD INTERNATIONAL CONFERENCE ON INTERNET-OF-THINGS DESIGN AND IMPLEMENTATION (IOTDI 2018), 2018, : 153 - 164
  • [8] Collaborative Training of Acoustic Encoders for Speech Recognition
    Nagaraja, Varun
    Shi, Yangyang
    Venkatesh, Ganesh
    Kalinli, Ozlem
    Seltzer, Michael L.
    Chandra, Vikas
    INTERSPEECH 2021, 2021, : 4573 - 4577
  • [9] Privacy-Preserving Outsourced Speech Recognition for Smart IoT Devices
    Ma, Zhuo
    Liu, Yang
    Liu, Ximeng
    Ma, Jianfeng
    Li, Feifei
    IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (05): : 8406 - 8420
  • [10] Acoustic Model Adaptation for Speech Recognition
    Shinoda, Koichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2348 - 2362