Automatic detection of document script and orientation

Authors
Lu, Shijian [1]
Tan, Chew Lim [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Dept Comp Sci, Singapore 117543, Singapore
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a technique that automatically identifies the script and orientation of scanned document images. The technique uses stroke density and distribution to convert each document image into a document vector. For each script at each orientation, a number of reference document vectors are first constructed. The script and orientation of a query document are then determined from the similarity between its document vector and the preconstructed reference document vectors, using the K-nearest neighbor algorithm. Experiments show that the technique tolerates document skew and can detect the orientation of documents in different scripts.
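The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definition, the similarity measure, or the value of K, so the three-element vectors, the use of cosine similarity, `k=3`, and the script names and orientation angles in the labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, references, k=3):
    """Return the (script, orientation) label voted for by the k reference
    vectors most similar to the query vector."""
    ranked = sorted(
        references,
        key=lambda ref: cosine_similarity(query, ref[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference document vectors. In the paper these would be built
# from stroke density and distribution; the numbers here are toy values.
REFERENCES = [
    ([0.9, 0.1, 0.2], ("Roman", 0)),
    ([0.8, 0.2, 0.1], ("Roman", 0)),
    ([0.1, 0.9, 0.2], ("Roman", 180)),
    ([0.2, 0.8, 0.1], ("Roman", 180)),
    ([0.2, 0.1, 0.9], ("Chinese", 0)),
    ([0.1, 0.2, 0.8], ("Chinese", 0)),
]

if __name__ == "__main__":
    query_vector = [0.85, 0.15, 0.15]  # toy query document vector
    print(knn_classify(query_vector, REFERENCES, k=3))
```

Treating each (script, orientation) pair as a single class label mirrors the abstract's statement that reference vectors are built for each script at each orientation, so one nearest-neighbor vote decides both properties at once.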
Pages: 237-241
Number of pages: 5