Privacy Preserving Acoustic Model Training for Speech Recognition

Cited by: 0
Author
Tachioka, Yuuki [1 ]
Affiliation
[1] Denso IT Lab, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation and computer technology];
Discipline Code
0812
Abstract
In-domain speech data significantly improve the speech recognition performance of acoustic models. However, the data may contain confidential information, and exposure of transcriptions may breach speakers' privacy. In addition, speaker identification can be problematic when speakers want to hide their membership of a certain group. Thus, the in-domain data must be deleted after their period of use. However, once the data are deleted, models cannot be updated for future architectures. Privacy preservation is necessary when retaining speech data; it is important that the transcriptions cannot be reconstructed and the speaker cannot be identified. This paper proposes a privacy preserving acoustic model training (PPAMT) method that satisfies these requirements and formulates the sensitivities of three features (n-grams, phoneme labels, and acoustic features) for PPAMT. A sensitivity analysis showed that phoneme labels and acoustic features were less susceptible to PPAMT than n-grams, which is desirable because accurate phoneme labels and acoustic features are needed for acoustic model training. Speech recognition experiments showed that, as a result of this property, the word error rate degradation caused by PPAMT was less than 0.6%.
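The abstract does not specify the perturbation mechanism, but its core argument — features with lower sensitivity tolerate privacy protection with less distortion — follows the standard differential-privacy trade-off. A minimal illustrative sketch of that trade-off using the Laplace mechanism (not the paper's actual PPAMT method; the function name and parameter values below are assumptions for illustration):

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Perturb a feature vector with Laplace noise whose scale is
    sensitivity / epsilon, the standard epsilon-DP Laplace mechanism.
    Lower sensitivity => less noise at the same privacy budget."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=np.shape(values))

# Hypothetical sensitivities: a low-sensitivity feature (e.g. acoustic
# features, per the abstract's finding) is distorted far less than a
# high-sensitivity one (e.g. n-grams) at the same epsilon.
features = np.zeros(10)
low_noise = laplace_mechanism(features, sensitivity=0.1, epsilon=1.0)
high_noise = laplace_mechanism(features, sensitivity=10.0, epsilon=1.0)
```

This mirrors the abstract's conclusion: because phoneme labels and acoustic features have low sensitivity, the noise needed to protect them barely degrades acoustic model training, while the privacy cost is concentrated in the n-grams.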
Pages: 627 - 631
Page count: 5
Related Papers
50 records in total
  • [1] Using Privacy-Transformed Speech in the Automatic Speech Recognition Acoustic Model Training
    Salimbajevs, Askars
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 47 - 54
  • [2] Acoustic model training for speech recognition over mobile networks
    Vojtko, Juraj
    Kacur, Juraj
    Rozinaj, Gregor
    Korosi, Jan
    INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2013, 6 (02) : 65 - 74
  • [3] Privacy-Preserving Speaker Verification and Speech Recognition
    Abbasi, Wisam
    EMERGING TECHNOLOGIES FOR AUTHORIZATION AND AUTHENTICATION, ETAA 2022, 2023, 13782 : 102 - 119
  • [4] Configurable Privacy-Preserving Automatic Speech Recognition
    Aloufi, Ranya
    Haddadi, Hamed
    Boyle, David
    INTERSPEECH 2021, 2021, : 861 - 865
  • [5] Joint Training of Speech Separation, Filterbank and Acoustic Model for Robust Automatic Speech Recognition
    Wang, Zhong-Qiu
    Wang, DeLiang
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2839 - 2843
  • [6] Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition
    Feng, Tiantian
    Narayanan, Shrikanth
    2021 9TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2021,
  • [7] Privacy-Preserving Personal Model Training
    Servia-Rodriguez, Sandra
    Wang, Liang
    Zhao, Jianxin R.
    Mortier, Richard
    Haddadi, Hamed
    2018 IEEE/ACM THIRD INTERNATIONAL CONFERENCE ON INTERNET-OF-THINGS DESIGN AND IMPLEMENTATION (IOTDI 2018), 2018, : 153 - 164
  • [8] Collaborative Training of Acoustic Encoders for Speech Recognition
    Nagaraja, Varun
    Shi, Yangyang
    Venkatesh, Ganesh
    Kalinli, Ozlem
    Seltzer, Michael L.
    Chandra, Vikas
    INTERSPEECH 2021, 2021, : 4573 - 4577
  • [9] Privacy-Preserving Outsourced Speech Recognition for Smart IoT Devices
    Ma, Zhuo
    Liu, Yang
    Liu, Ximeng
    Ma, Jianfeng
    Li, Feifei
    IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (05): : 8406 - 8420
  • [10] Acoustic Model Adaptation for Speech Recognition
    Shinoda, Koichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2348 - 2362