A discrete Arabic script for better automatic document understanding

Cited by: 0
Author
Abuhaiba, ISI [1]
Affiliation
[1] Islam Univ Gaza, Dept Elect & Comp Engn, Gaza, Israel
Source
Keywords
document understanding; cursive Arabic script; discrete Arabic script; character segmentation; TrueType font; left white space; right white space
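DOI
Not available
Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract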
This paper lays the groundwork for the development of new fonts that produce, for the first time, discrete Arabic script instead of cursive Arabic script. These fonts help in automatic document understanding and can be used to print books, newspapers, periodicals, and all other printed materials. All other properties of the Arabic writing system are preserved when producing such fonts. The history of Arabic calligraphy since its beginning provides a strong defense of our call to break the cursive law of Arabic script. We were able to develop new fonts for discrete Arabic typography such that the characters can be segmented with simple vertical white cuts. Two parameters are investigated to suit the new requirements: left and right white spaces. Nine A4 pages of Arabic script were used in our experiments to empirically determine a sufficient amount of these spaces. A font with left and right spaces of 160 FUnits each achieved a segmentation success rate of 99.99%.
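The abstract does not describe the segmentation procedure in detail, so the following is only a minimal sketch of what "simple vertical white cuts" could look like, not the author's implementation. It assumes a binarized text-line image stored as rows of 0 (white) and 1 (ink). The function names, the 2048 units-per-em value (common for TrueType fonts, but not stated in the abstract), and the 12 pt / 300 dpi settings in the usage note are all illustrative assumptions.

```python
from typing import List, Tuple


def funits_to_pixels(funits: float, point_size: float, dpi: float,
                     units_per_em: int = 2048) -> float:
    """Convert font design units (FUnits) to device pixels.

    units_per_em=2048 is an assumed default; the real value comes
    from the font being used.
    """
    return funits * point_size * dpi / (72.0 * units_per_em)


def vertical_white_cuts(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Split a binary text-line image at all-white columns.

    Returns (start, end) column ranges, end exclusive, one per run of
    consecutive columns that contain at least one ink pixel.
    """
    if not image:
        return []
    width = len(image[0])
    # A column "has ink" if any row has a nonzero pixel in that column.
    has_ink = [any(row[x] for row in image) for x in range(width)]

    segments: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # entering a glyph
        elif not ink and start is not None:
            segments.append((start, x))  # white column closes the glyph
            start = None
    if start is not None:
        segments.append((start, width))  # glyph touching the right edge
    return segments


# Hypothetical two-row line image with two separated glyphs.
line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
segments = vertical_white_cuts(line)
gap_px = funits_to_pixels(160, point_size=12, dpi=300)
```

Under the assumed settings, `funits_to_pixels(160, 12, 300)` comes to roughly 3.9 pixels, which gives a feel for how much blank space a 160 FUnit margin leaves on each side of a glyph for the cut to fall in.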
Pages: 77-94
Number of pages: 18