Privacy Preserving Acoustic Model Training for Speech Recognition

Author:

Tachioka, Yuuki [1]

Affiliation:

[1] Denso IT Lab, Tokyo, Japan

Source:

2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2020

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In-domain speech data significantly improve the speech recognition performance of acoustic models. However, the data may contain confidential information and exposure of transcriptions may lead to a breach in speakers' privacy. In addition, speaker identification can be problematic when speakers want to hide their membership of a certain group. Thus, the in-domain data must be deleted after its period of use. However, once the data are deleted, models cannot be updated for future architectures. Privacy preservation is necessary when retaining speech data; it is important that the transcriptions cannot be reconstructed and the speaker cannot be identified. This paper proposes a privacy preserving acoustic model training (PPAMT) method that satisfies these requirements and formulates the sensitivities of three features (n-grams, phoneme labels, and acoustic features) for PPAMT. A sensitivity analysis showed that phoneme labels and acoustic features were less susceptible to PPAMT than n-grams, which is optimal because accurate phoneme labels and acoustic features are needed for acoustic model training. Speech recognition experiments showed that the word error rate degradation by PPAMT was less than 0.6% as a result of this property.

Pages: 627-631

Page count: 5
