Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes

Cited by: 11

Authors:

El Amrani, Mohamed Yassine ^{[1,2]}

Rahman, M. M. Hafizur ^{[2]}

Wahiddin, Mohamed Ridza ^{[2]}

Shah, Asadullah ^{[2]}

Affiliations:

[1] Jubail Univ Coll, Dept Comp Sci & Engn, Yanbu, Al Jubail, Saudi Arabia

[2] Int Islamic Univ Malaysia, Dept Comp Sci, Kulliah Informat Commun Technol, Kuala Lumpur, Selangor, Malaysia

Source:

EGYPTIAN INFORMATICS JOURNAL | 2016 / Volume 17 / Issue 03

Keywords:

Automatic speech recognition; Holy Quran recognition; Human voice;

DOI:

10.1016/j.eij.2016.04.002

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes, instead of the commonly used Romanized set, in order to simplify the process of generating it. With a very small set of audio files, the experiments yielded a very low Word Error Rate (WER) of 1.5% when all the audio data was used for both the training and the testing phases. However, when 90% and 80% of the training data were used, the WER obtained was 50.0% and 55.7%, respectively. © 2016 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
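
For readers unfamiliar with the metric, the WER figures quoted in the abstract follow the standard definition: the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal Python sketch of that definition only. The function name `word_error_rate` and the sample transliterated strings are illustrative assumptions; this record does not describe the scoring tool the authors actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration; not taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transliterated words (assumed, not from the paper's corpus):
# one of four reference words is dropped by the recognizer.
wer = word_error_rate("bismi allahi alrrahmani alrraheemi",
                      "bismi allahi alrraheemi")
print(f"WER = {wer:.1%}")
```

Under this definition, a WER of 1.5% corresponds to roughly 15 word-level errors per 1,000 reference words, while a WER of 50.0% corresponds to about one error for every two reference words.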

Citation

Pages: 305-314

Page count: 10
