A discrete Arabic script for better automatic document understanding

Cited by: 0
Authors
Abuhaiba, ISI [1]
Affiliation
[1] Islam Univ Gaza, Dept Elect & Comp Engn, Gaza, Israel
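Source
Keywords
document understanding; cursive Arabic script; discrete Arabic script; character segmentation; TrueType font; left white space; right white space
DOI
Not available
Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract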
This paper lays the groundwork for developing new fonts that produce, for the first time, discrete rather than cursive Arabic script. Such fonts aid automatic document understanding and can be used to print books, newspapers, periodicals, and all other printed materials. All other properties of the Arabic writing system are preserved in these fonts. The history of Arabic calligraphy since its beginning strongly supports our call to break the cursive rule of Arabic script. We developed new fonts for discrete Arabic typography in which the characters can be segmented with simple vertical white cuts. Two parameters are investigated to meet the new requirements: the left and right white spaces of each character. Nine A4 pages of Arabic script were used in our experiments to determine empirically how much of this space is sufficient. A font with left and right white spaces of 160 FUnits each achieved a segmentation success rate of 99.99%.
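The idea of segmenting by vertical white cuts can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a binarized text-line image given as a list of rows (1 for ink, 0 for white) and treats every all-white column as a cut. The function names `vertical_white_cuts` and `funits_to_pixels` are hypothetical. The em size of 2048 FUnits is a common TrueType value but is not stated in the abstract, and the 12 pt and 300 dpi figures are likewise assumed for illustration only.

```python
from typing import List, Tuple


def funits_to_pixels(funits: float, point_size: float, dpi: float,
                     units_per_em: int = 2048) -> float:
    """Convert font design units (FUnits) to device pixels.

    units_per_em=2048 is an assumed, commonly used TrueType em size;
    the abstract does not say which value the font uses.
    """
    return funits * point_size * dpi / (72.0 * units_per_em)


def vertical_white_cuts(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Split a binary text-line image into ink runs separated by white columns.

    image[r][c] is 1 for ink and 0 for white. Returns a list of
    (start, end) column ranges, with end exclusive, one per ink run.
    """
    if not image:
        return []
    width = len(image[0])
    # Column projection profile: number of ink pixels in each column.
    profile = [sum(row[c] for row in image) for c in range(width)]

    segments: List[Tuple[int, int]] = []
    start = None
    for c, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = c                      # an ink run begins
        elif ink == 0 and start is not None:
            segments.append((start, c))    # a white column ends the run
            start = None
    if start is not None:
        segments.append((start, width))    # run touches the right edge
    return segments


# Two toy "characters" separated by two white columns.
line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
segments = vertical_white_cuts(line)

# Width of a 160-FUnit side space at an assumed 12 pt and 300 dpi.
gap_px = funits_to_pixels(160, point_size=12, dpi=300)
```

Under these assumed settings, each 160-FUnit side space is a few pixels wide. Adjacent characters are then separated by the sum of one character's left space and its neighbour's right space, which is what leaves a white column for the cut to find.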
Pages: 77-94
Number of pages: 18