PROPER NOUN DETECTION IN DOCUMENT IMAGES

Cited by: 7
Authors
DESILVA, GL [1]
HULL, JJ [1]
Affiliation
[1] CTR EXCELLENCE DOCUMENT ANAL & RECOGNIT, DEPT COMP SCI, 226 BELL HALL, BUFFALO, NY 14260
Keywords
PROPER NOUN DETECTION; CHARACTER RECOGNITION; WORD RECOGNITION; FEATURE EXTRACTION; CAPITALIZED WORD DETECTION; NEAREST NEIGHBOR CLASSIFIER
DOI
10.1016/0031-3203(94)90062-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An algorithm for the detection of proper nouns in document images printed in mixed upper and lower case is presented. Analysis of graphical features of words in a running text is performed to determine words that are likely to be names of specific persons, places, or objects (i.e. proper nouns). This algorithm is a useful addition to contextual post-processing (CPP) or whole word recognition techniques where word images are matched to entries in a dictionary. Due to the difficulty of creating a comprehensive list of proper nouns, a methodology of locating such words prior to recognition will allow for the use of specialized recognition strategies for those words only. Experimental results demonstrate that about 90% of all occurrences of proper nouns were located and over 97% of the unique proper nouns in a document were found using this algorithm.
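The abstract reports two different recall figures: one over all occurrences of proper nouns and one over unique proper nouns. The sketch below is not the authors' evaluation code. It is a minimal illustration, on hypothetical toy data, of how the two figures can be computed and why the unique-word figure can be higher: a word counts as found if at least one of its occurrences is detected. The function name `proper_noun_recall` and the input layout are assumptions made for this example.

```python
def proper_noun_recall(occurrences):
    """Return (occurrence_recall, unique_recall).

    `occurrences` is a list of (word, detected) pairs, one pair per
    ground-truth proper-noun occurrence in a document. This layout is
    an assumption for illustration, not taken from the paper.
    """
    if not occurrences:
        raise ValueError("need at least one proper-noun occurrence")

    # Occurrence-level recall: fraction of individual occurrences detected.
    hits = sum(1 for _, detected in occurrences if detected)
    occurrence_recall = hits / len(occurrences)

    # Unique-level recall: a word is found if any occurrence was detected.
    found = {}
    for word, detected in occurrences:
        found[word] = found.get(word, False) or detected
    unique_recall = sum(found.values()) / len(found)

    return occurrence_recall, unique_recall


# Hypothetical toy data: "Buffalo" appears three times, one occurrence missed.
sample = [
    ("Buffalo", True),
    ("Buffalo", False),
    ("Buffalo", True),
    ("Hull", True),
    ("Bell", True),
]
occ, uniq = proper_noun_recall(sample)
print(f"occurrence recall = {occ:.2f}, unique recall = {uniq:.2f}")
```

In this toy case one missed occurrence lowers the occurrence-level figure, while the unique-word figure is unaffected because the same word is detected elsewhere in the text.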
Pages: 311-320
Page count: 10