A computer-aided speech analytics approach for pronunciation feedback using deep feature clustering

Authors
Faria Nazir
Muhammad Nadeem Majeed
Mustansar Ali Ghazanfar
Muazzam Maqsood
Affiliations
[1] University of Engineering and Technology Taxila, Department of Software Engineering
[2] University of the Punjab, Department of Data Science
[3] University of East London, School of Architecture, Computing and Engineering
[4] COMSATS University Islamabad, Department of Computer Science
Source
Multimedia Systems | 2023 / Vol. 29
Keywords
Speech analytics; Deep convolutional neural network; Multimedia tools; Deep clustering; Phone variation model
DOI
Not available
Abstract
Demand for language learning is growing because people need to communicate across regions for business, study, and other purposes. During language learning, many pronunciation mistakes occur because of unfamiliarity with the new language and differences in accent. In this paper, we analyze speech mistakes using deep feature-based clustering. We propose two novel methods for speech analysis: one to handle phonemic errors (confusing phonemes) and the other to handle prosodic errors (partially changed pronunciation variations of phones). For accurate and efficient language learning, it is important to learn to correct both phonemic and prosodic errors. In our first method, we combine deep CNN features with a clustering algorithm to detect phonemic errors, and we classify the phonemes using K-nearest neighbor, Naïve Bayes, and support vector machine (SVM) classifiers. We perform experiments on the six most frequently mispronounced confusing phoneme pairs in Arabic and achieve an accuracy of 94%. In our second method, we propose the unsupervised phone variation model (PVM) to detect prosodic errors. In PVM, each phone is extended to represent the different types of pronunciation variation of that phone at different proficiency levels. We use an Arabic dataset of 28 individual phones for speech analysis, provide feedback based on the variation of each phone, and achieve an accuracy of 97%.
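The first method in the abstract (deep features, then clustering, then a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors are synthetic Gaussian data standing in for deep CNN features, k-means is assumed as the clustering algorithm (the abstract does not name one), appending centroid distances to the features is one assumed way of combining the two, and only the K-nearest neighbor classifier is shown. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def make_features(n_per_class=60, dim=4, seed=0):
    """Synthetic vectors standing in for deep CNN features (hypothetical data)."""
    rng = random.Random(seed)
    data = []
    # Two well-separated classes mimic one confusing phoneme pair.
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(center, 1.0) for _ in range(dim)], label))
    rng.shuffle(data)
    return data


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def augment(vec, centroids):
    """Append the distance to every centroid to the raw feature vector."""
    return vec + [math.dist(vec, c) for c in centroids]


def knn_predict(train, query, k=5):
    """Majority vote among the k nearest labeled training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def run_demo(seed=0):
    """Cluster the training features, augment, and score kNN on held-out data."""
    data = make_features(seed=seed)
    split = int(0.7 * len(data))
    train, test = data[:split], data[split:]
    centroids = kmeans([v for v, _ in train], k=2, seed=seed)
    train_aug = [(augment(v, centroids), y) for v, y in train]
    correct = sum(
        knn_predict(train_aug, augment(v, centroids)) == y for v, y in test
    )
    return correct / len(test)


if __name__ == "__main__":
    print(f"kNN accuracy on synthetic data: {run_demo():.2f}")
```

The accuracy printed here reflects only the easy synthetic data and says nothing about the 94% reported in the abstract. In a real pipeline, `make_features` would be replaced by features extracted from a trained CNN over speech segments.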
Pages: 1699-1715
Page count: 16