A computer-aided speech analytics approach for pronunciation feedback using deep feature clustering

Cited by: 0

Authors:

Faria Nazir

Muhammad Nadeem Majeed

Mustansar Ali Ghazanfar

Muazzam Maqsood

Affiliations:

[1] University of Engineering and Technology Taxila, Department of Software Engineering

[2] University of the Punjab, Department of Data Science

[3] University of East London, School of Architecture, Computing and Engineering

[4] COMSATS University Islamabad, Department of Computer Science

Source:

Multimedia Systems | 2023 / Vol. 29

Keywords:

Speech analytics; Deep convolutional neural network; Multimedia tools; Deep clustering; Phone variation model

DOI:

Not available

Abstract:

Demand for language learning is growing because people increasingly need to communicate across regions for business, study, and other purposes. Learners make many pronunciation mistakes because they are unfamiliar with the new language and because accents differ. In this paper, we analyze speech mistakes using deep feature-based clustering. We propose two novel methods for speech analysis: one for phonemic errors (confusing phonemes) and one for prosodic errors (partially changed pronunciation variations of phones). Accurate and efficient language learning requires correcting both phonemic and prosodic errors. In the first method, we combine deep CNN features with a clustering algorithm to detect phonemic errors, and we classify the phonemes using K-nearest neighbor, Naïve Bayes, and support vector machine (SVM) classifiers. In experiments on the six most frequently mispronounced confusing phoneme pairs of Arabic, this method achieves an accuracy of 94%. In the second method, we propose the unsupervised phone variation model (PVM) to detect prosodic errors. In PVM, each phone is extended to represent the different pronunciation variations of that phone at different proficiency levels. We use an Arabic dataset of 28 individual phones for speech analysis, provide feedback based on the variation of each phone, and achieve an accuracy of 97%.
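
The abstract does not name the clustering algorithm or the CNN architecture, so the following is only an illustrative sketch of the general idea behind clustering feature vectors of one phone into pronunciation variants and mapping a new utterance to the nearest variant. It assumes k-means as the clustering step and uses synthetic 2-D vectors in place of real deep CNN features; the names `kmeans`, `nearest`, `blob_a`, and `blob_b` are hypothetical and do not come from the paper.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means over tuples of floats; returns the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        # Assignment step: index of the closest centroid for every point.
        labels = [nearest(centroids, p) for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster where it was
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids


def nearest(centroids, x):
    """Index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(x, centroids[c]))


# Synthetic stand-ins for deep features of two pronunciation variants of one phone.
rng = random.Random(1)
blob_a = [(rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(30)]
blob_b = [(rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)) for _ in range(30)]
features = blob_a + blob_b

centroids = kmeans(features, k=2)
labels = [nearest(centroids, p) for p in features]

# A new utterance is matched to the closest variant cluster, which could then
# drive the feedback shown to the learner.
new_utterance = (4.8, 5.1)
variant = nearest(centroids, new_utterance)
```

In a real system the feature vectors would be high-dimensional CNN activations and the number of variants per phone would have to be chosen or estimated; both are left open here because the abstract does not specify them.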

Pages: 1699-1715

Page count: 16
