A discrete Arabic script for better automatic document understanding

Times cited: 0
Author
Abuhaiba, ISI [1]
Affiliation
[1] Islam Univ Gaza, Dept Elect & Comp Engn, Gaza, Israel
Source
Keywords
document understanding; cursive Arabic script; discrete Arabic script; character segmentation; TrueType font; left white space; right white space
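DOI
Not available
Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Discipline classification codes
07; 0710; 09
Abstract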
This paper lays the groundwork for the development of new fonts that produce, for the first time, discrete Arabic script instead of cursive Arabic script. These fonts help in automatic document understanding and can be used to print books, newspapers, periodicals, and all other printed materials. All other properties of the Arabic writing system are preserved when producing such fonts. The history of Arabic calligraphy since its beginning provides a strong defense of our call to break the cursive law of Arabic script. We could develop new fonts for discrete Arabic typography such that the characters can be segmented with simple vertical white cuts. Two parameters are investigated to suit the new requirements: the left and right white spaces. Nine A4 pages of Arabic script were used in our experiments to empirically determine a sufficient amount of these spaces. A font with left and right spaces of 160 FUnits each achieved a segmentation success rate of 99.99%.
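The abstract's idea of segmenting characters with simple vertical white cuts can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a text line is already available as a binary bitmap (rows of 0 for white and 1 for ink), and the function name `vertical_white_cuts` is hypothetical. A column containing no ink is treated as a cut, and each run of inked columns between cuts is reported as one character.

```python
def vertical_white_cuts(bitmap):
    """Split a binary text-line image into character column ranges.

    bitmap: list of rows, each a list of 0 (white) / 1 (ink) pixels.
            This representation is an assumption made for illustration.
    Returns a list of (start, end) column ranges, with end exclusive.
    """
    if not bitmap:
        return []
    width = len(bitmap[0])
    # A column counts as white when no row has ink in it.
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins here
        elif not ink and start is not None:
            segments.append((start, x))    # a white column ends it
            start = None
    if start is not None:
        segments.append((start, width))    # character touches the right edge
    return segments


# Toy line with three characters separated by white columns.
line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
print(vertical_white_cuts(line))  # [(0, 2), (4, 5), (6, 8)]
```

Such a cut only works when every pair of adjacent glyphs is separated by at least one fully white column, which is why the abstract focuses on choosing sufficient left and right white spaces.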
Pages: 77-94
Page count: 18