Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes

Cited by: 11
Authors
El Amrani, Mohamed Yassine [1, 2]
Rahman, M. M. Hafizur [2 ]
Wahiddin, Mohamed Ridza [2 ]
Shah, Asadullah [2 ]
Affiliations
[1] Jubail Univ Coll, Dept Comp Sci & Engn, Yanbu, Al Jubail, Saudi Arabia
[2] Int Islamic Univ Malaysia, Dept Comp Sci, Kulliah Informat Commun Technol, Kuala Lumpur, Selangor, Malaysia
Keywords
Automatic speech recognition; Holy Quran recognition; Human voice
DOI
10.1016/j.eij.2016.04.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes, instead of the commonly used Romanized set, in order to simplify the process of generating the language model. With a very small set of audio files, the experiments yielded a very low Word Error Rate (WER) of 1.5% when all the audio data was used for both the training and the testing phases. However, when using 90% and 80% of the training data, the WER obtained was 50.0% and 55.7%, respectively. © 2016 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
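For readers unfamiliar with the metric reported in the abstract, WER is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal Python sketch of that standard definition. It is not the authors' evaluation code or the CMU Sphinx scoring tool; the function name `word_error_rate` and the placeholder tokens are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name), not part of CMU Sphinx.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens stand in for transcript words: one substitution
# (w2 -> wX) and one deletion (w4) against four reference words.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))  # 0.5
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.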
Pages: 305-314
Page count: 10
Related papers
46 in total
  • [1] Building a Rich Arabic Speech and Language Corpus Based on the Holy Quran
    Meftah, Ali
    Seddiq, Yasser
    Alotaibi, Yousef
    Selouani, Sid-Ahmed
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, 2018, 782 : 90 - 101
  • [2] Testing Sphinx's language model fault-tolerance for the Holy Quran
    El Amrani, Mohamed Yassine
    Rahman, M. M. Hafizur
    Wahiddin, Mohamed Ridza
    Shah, Asadullah
    2016 6TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY FOR THE MUSLIM WORLD (ICT4M), 2016, : 88 - 92
  • [3] Investigation Arabic Speech Recognition Using CMU Sphinx System
    Satori, Hassan
    Hiyassat, Hussein
    Harti, Mostafa
    Chenfour, Noureddine
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2009, 6 (02) : 186 - 190
  • [4] Ontology-Based Model for Arabic Lexicons: An Application of the Place Nouns in the Holy Quran
    Alromima, Waseem
    Moawad, Ibrahim F.
    Elgohary, Rania
    Aref, Mostafa
    2015 11TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO), 2015, : 137 - 143
  • [5] Arabic phonemes recognition using hybrid LVQ/HMM model for continuous speech recognition
    Nahar, Khalid M. O.
    Abu Shquier, Mohammed
    Al-Khatib, Wasfi G.
    Al-Muhtaseb, Husni
    Elshafei, Moustafa
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2016, 19 (03) : 495 - 508
  • [6] A Model for Implementing e-Teaching Objects for the Holy Quran and Related Sciences using Animations
    Basuhail, Abdullah Ahmad
    2013 TAIBAH UNIVERSITY INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION TECHNOLOGY FOR THE HOLY QURAN AND ITS SCIENCES, 2013, : 83 - 88
  • [7] Chatbot in Arabic language using seq to seq model
    Boussakssou, M.
    Ezzikouri, H.
    Erritali, M.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (02) : 2859 - 2871
  • [9] Exploitation of an Arabic Language Resource for MT Evaluation: Using Buckwalter-based Lookup Tool to Augment CMU Alignment Algorithm
    Voss, Clare R.
    Laoudi, Jamal
    Micher, Jeffrey
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 3498 - 3505
  • [10] Building a First Language Model for Code-switch Arabic-English
    Hamed, Injy
    Elmahdy, Mohamed
    Abdennadher, Slim
    ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2017), 2017, 117 : 208 - 216