Automatic detection of document script and orientation

Cited by: 0
Authors
Lu, Shijian [1]
Tan, Chew Lim [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Dept Comp Sci, Singapore 117543, Singapore
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a technique that automatically detects the script and orientation of scanned document images. The technique characterizes each document image by its stroke density and distribution, which convert the image into a document vector. For each script at each orientation, a number of reference document vectors are first constructed. The script and orientation of a query document are then determined by the similarity between the query document vector and the preconstructed reference document vectors, using the K-nearest neighbor algorithm. Experiments show that the proposed technique tolerates document skew and detects the orientation of documents in different scripts.
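The classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how stroke density and distribution are encoded or which similarity measure is used, so the four-dimensional vectors, the script and orientation labels, the use of cosine similarity, and the names `cosine_similarity`, `knn_classify`, and `REFERENCES` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, references, k=3):
    """Return the (script, orientation) label voted by the k most similar references."""
    ranked = sorted(
        references,
        key=lambda ref: cosine_similarity(query, ref[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference document vectors: several per script at each orientation.
# The numbers are placeholders, not real stroke density features.
REFERENCES = [
    ([0.9, 0.1, 0.3, 0.2], ("Roman", 0)),
    ([0.8, 0.2, 0.4, 0.1], ("Roman", 0)),
    ([0.1, 0.9, 0.2, 0.3], ("Roman", 180)),
    ([0.2, 0.8, 0.1, 0.4], ("Roman", 180)),
    ([0.4, 0.4, 0.9, 0.8], ("Chinese", 0)),
    ([0.5, 0.3, 0.8, 0.9], ("Chinese", 0)),
]

query = [0.85, 0.15, 0.35, 0.15]  # hypothetical query document vector
print(knn_classify(query, REFERENCES, k=3))  # prints ('Roman', 0)
```

Treating each (script, orientation) pair as a single class label lets one nearest-neighbor vote decide both properties at once, which matches the abstract's description of building reference vectors "for each script at each orientation".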
Pages: 237-241
Number of pages: 5