Modern Standard Arabic speech disorders corpus for digital speech processing applications

Authors
Alqudah A.A.M. [1, 3]
Alshraideh M.A.M. [1 ]
Abushariah M.A.M. [2 ]
Sharieh A.A.S. [1 ]
Affiliations
[1] Department of Computer Science, King Abdullah II School of Information Technology, The University of Jordan, Amman
[2] Department of Computer Information Systems, King Abdullah II School of Information Technology, The University of Jordan, Amman
[3] Department of Computer Science, Faculty of Science and Information Technology, Al-Zaytoonah University of Jordan, Amman
Source
Int J Speech Technol | 2024 | Vol. 1 | pp. 157-170
Keywords
Automatic speech recognition; CMU Pocketsphinx; HMM; LDA; MFCC; MLLT; Modern Standard Arabic; Speech corpus; Speech disorders
DOI
10.1007/s10772-024-10086-9
Abstract
Digital speech processing applications, including automatic speech recognition (ASR), speaker recognition, and speech translation, require large volumes of speech data for training and testing. Although speech corpora are available, data from speakers with speech disorders are scarce for many languages, including Arabic. As a result, it is difficult to develop speech processing applications that serve the whole of society, because existing corpora lack sufficient speaker variation covering both healthy and disordered speech. This research presents our work towards developing a Modern Standard Arabic (MSA) speech corpus for speakers with distortion and substitution articulation disorders. The corpus was recorded by 40 Jordanian speakers (20 male and 20 female) who have distortion and/or substitution articulation disorders. It can be used for various applications, including ASR and speech and hearing applications, among others. Part of the corpus is used to develop and evaluate an ASR system for MSA using the Carnegie Mellon University (CMU) Pocketsphinx tools, based on Mel-Frequency Cepstral Coefficients (MFCC) and Hidden Markov Models (HMM). Linear Discriminant Analysis (LDA) and Maximum Likelihood Linear Transform (MLLT) optimization techniques were also applied. Using three different testing data sets, this work obtained an average word recognition correctness rate (WRCR) of 98.38% and an average Word Error Rate (WER) of 1.76% for speaker-dependent, text-independent testing. For speaker-independent, text-dependent testing, the average WRCR and WER were 99.37% and 0.68%, respectively, and for speaker-independent, text-independent testing they were 96.53% and 4.00%.
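The reported WRCR and WER pairs do not sum to 100% (for example, 98.38% + 1.76%). One common scoring convention that would produce this, assumed here and not confirmed by the abstract, is that WER counts substitutions, deletions, and insertions, while WRCR ignores insertions. The sketch below illustrates that convention on a toy word sequence; the function names `align_counts` and `scores` are hypothetical and are not part of Pocketsphinx.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) from a minimum-edit
    alignment of the reference word list against the hypothesis word list."""
    n, m = len(ref), len(hyp)
    # cell[i][j] = (total_edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    cell = [[None] * (m + 1) for _ in range(n + 1)]
    cell[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, s, d, k = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, s, d, k)            # match, no cost
            else:
                diag = (e + 1, s + 1, d, k)    # substitution
            e, s, d, k = cell[i - 1][j]
            dele = (e + 1, s, d + 1, k)        # deletion
            e, s, d, k = cell[i][j - 1]
            inse = (e + 1, s, d, k + 1)        # insertion
            cell[i][j] = min(diag, dele, inse, key=lambda t: t[0])
    return cell[n][m][1:]


def scores(ref, hyp):
    """Return (WRCR, WER) in percent under the assumed convention:
    WRCR = (N - S - D) / N and WER = (S + D + I) / N."""
    subs, dels, ins = align_counts(ref, hyp)
    n = len(ref)
    wrcr = 100.0 * (n - subs - dels) / n
    wer = 100.0 * (subs + dels + ins) / n
    return wrcr, wer


if __name__ == "__main__":
    reference = "a b c d".split()
    hypothesis = "a x c d e".split()   # one substitution, one insertion
    print(scores(reference, hypothesis))
```

Under this assumption, an inserted word raises WER without lowering WRCR, which is why the two figures can add up to more than 100%.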
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
Pages: 157-170
Number of pages: 13