A computer-aided speech analytics approach for pronunciation feedback using deep feature clustering

Authors
Faria Nazir
Muhammad Nadeem Majeed
Mustansar Ali Ghazanfar
Muazzam Maqsood
Affiliations
[1] University of Engineering and Technology Taxila, Department of Software Engineering
[2] University of the Punjab, Department of Data Science
[3] University of East London, School of Architecture, Computing and Engineering
[4] COMSATS University Islamabad, Department of Computer Science
Source
Multimedia Systems | 2023 / Vol. 29
Keywords
Speech analytics; Deep convolutional neural network; Multimedia tools; Deep clustering; Phone variation model
DOI
Not available
Abstract
Demand for language learning is growing because people need to communicate across regions for business, study, and other purposes. Learners make many pronunciation mistakes because they are unfamiliar with the new language and because accents differ. In this paper, we analyze speech mistakes using deep feature-based clustering. We propose two novel methods for speech analysis: one for phonemic errors (confusing phonemes) and one for prosodic errors (partially changed pronunciation variations of phones). Accurate and efficient language learning requires correcting both phonemic and prosodic errors. In the first method, we combine deep CNN features with a clustering algorithm to detect phonemic errors, and we classify the phonemes using K-nearest neighbor, Naïve Bayes, and support vector machine (SVM) classifiers. Experiments on the six most frequently mispronounced confusing pairs in Arabic yield an accuracy of 94%. In the second method, we propose an unsupervised phone variation model (PVM) to detect prosodic errors. In the PVM, each phone is extended to represent the different pronunciation variations of that phone at different proficiency levels. We use an Arabic dataset of 28 individual phones for speech analysis, provide feedback based on the variation of each phone, and achieve an accuracy of 97%.
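The first method in the abstract (deep features, then clustering, then a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic Gaussian vectors in place of real deep CNN features, a plain k-means as the clustering step, and a k-nearest-neighbor classifier applied to each sample's distances to the cluster centroids. The way clustering and classification are combined here, and all names (`make_features`, `to_cluster_space`, `run_demo`, `phoneme_a`, `phoneme_b`), are hypothetical choices made for illustration.

```python
import math
import random


def make_features(n_per_class, dim, seed=0):
    """Synthetic stand-in for deep CNN features of two confusable phonemes."""
    rng = random.Random(seed)
    centers = {"phoneme_a": [0.0] * dim, "phoneme_b": [3.0] * dim}
    data = []
    for label, center in centers.items():
        for _ in range(n_per_class):
            data.append(([c + rng.gauss(0, 0.5) for c in center], label))
    rng.shuffle(data)
    return data


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def to_cluster_space(x, centroids):
    """Assumed combination step: describe a sample by its distance to each centroid."""
    return [math.dist(x, c) for c in centroids]


def knn_predict(train, x, k=3):
    """Majority vote among the k nearest labeled training samples."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def run_demo():
    """Cluster the training features, classify the held-out ones, return accuracy."""
    data = make_features(n_per_class=40, dim=8)
    split = int(0.75 * len(data))
    train, test = data[:split], data[split:]
    centroids = kmeans([x for x, _ in train], k=2)
    train_c = [(to_cluster_space(x, centroids), y) for x, y in train]
    correct = sum(
        knn_predict(train_c, to_cluster_space(x, centroids)) == y for x, y in test
    )
    return correct / len(test)


if __name__ == "__main__":
    print(f"accuracy on synthetic data: {run_demo():.2f}")
```

The accuracy printed by this sketch reflects only the easily separable synthetic data and says nothing about the 94% reported in the abstract. Naïve Bayes or an SVM could replace `knn_predict` without changing the rest of the pipeline.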
Pages: 1699-1715
Page count: 16