A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning

Cited by: 1
Authors
Ryu, Hyungshin [1]
Kim, Sunhee [2]
Chung, Minhwa [1]
Affiliations
[1] Seoul Natl Univ, Dept Linguist, Seoul, South Korea
[2] Seoul Natl Univ, Dept French Language Educ, Seoul, South Korea
Source
INTERSPEECH 2023
Keywords
computer-assisted pronunciation training; multi-task learning; mispronunciation detection and diagnosis; automatic pronunciation assessment; transfer learning; SPEECH; COMPREHENSIBILITY; INTELLIGIBILITY; ACCENTEDNESS; GRANULARITY
DOI
10.21437/Interspeech.2023-337
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Empirical studies report a strong correlation between pronunciation proficiency scores and phonetic errors in human evaluators' assessments of non-native speech. However, existing computer-assisted pronunciation training (CAPT) systems treat automatic pronunciation assessment (APA) and mispronunciation detection and diagnosis (MDD) as independent tasks and focus on improving each one individually. Motivated by the correlation between the two tasks, we propose a novel architecture that jointly tackles APA and MDD using CTC and cross-entropy criteria with a multi-task learning scheme to benefit both tasks. To leverage additional knowledge transfer, Wav2Vec2-robust finetuned on TIMIT is used for the joint optimization. The integrated model significantly outperforms single-task learning, with a mean increase of 0.057 in PCC for APA and 0.004 in F1 for MDD on Speechocean762, which reveals that proficiency scores and phonetic errors are correlated for both human and model assessments.
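The abstract reports results with two metrics: the Pearson correlation coefficient (PCC) for APA and the F1 score for MDD. As a minimal sketch of how such metrics are commonly computed, the following uses only the Python standard library. The function names and the toy data are hypothetical, and the paper's exact evaluation protocol (for example, how mispronunciation labels are aligned and counted) is not given here, so treat this as an illustration, not the authors' scoring code.

```python
import math


def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def f1_score(true_flags, pred_flags):
    """F1 over binary flags (assumption: True marks a mispronounced phone)."""
    pairs = list(zip(true_flags, pred_flags))
    tp = sum(1 for t, p in pairs if t and p)        # correctly flagged errors
    fp = sum(1 for t, p in pairs if not t and p)    # false alarms
    fn = sum(1 for t, p in pairs if t and not p)    # missed errors
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from Speechocean762.
    human_scores = [2.0, 1.0, 2.0, 0.0]
    model_scores = [1.8, 1.1, 1.6, 0.3]
    print("PCC:", pearson_cc(human_scores, model_scores))

    true_flags = [True, False, True, False]
    pred_flags = [True, True, False, False]
    print("F1:", f1_score(true_flags, pred_flags))
```

Read this way, the reported gains are absolute differences on each metric's own scale, so the 0.057 PCC increase and the 0.004 F1 increase are not directly comparable to each other.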
Pages: 959-963
Number of pages: 5