PROPER NOUN DETECTION IN DOCUMENT IMAGES

Cited by: 7
Authors
DESILVA, GL [1]
HULL, JJ [1]
Affiliation
[1] CTR EXCELLENCE DOCUMENT ANAL & RECOGNIT,DEPT COMP SCI,226 BELL HALL,BUFFALO,NY 14260
Keywords
PROPER NOUN DETECTION; CHARACTER RECOGNITION; WORD RECOGNITION; FEATURE EXTRACTION; CAPITALIZED WORD DETECTION; NEAREST NEIGHBOR CLASSIFIER;
DOI
10.1016/0031-3203(94)90062-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An algorithm for the detection of proper nouns in document images printed in mixed upper and lower case is presented. Analysis of graphical features of words in a running text is performed to determine words that are likely to be names of specific persons, places, or objects (i.e. proper nouns). This algorithm is a useful addition to contextual post-processing (CPP) or whole word recognition techniques where word images are matched to entries in a dictionary. Due to the difficulty of creating a comprehensive list of proper nouns, a methodology of locating such words prior to recognition will allow for the use of specialized recognition strategies for those words only. Experimental results demonstrate that about 90% of all occurrences of proper nouns were located and over 97% of the unique proper nouns in a document were found using this algorithm.
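The abstract reports two different recall figures: one over all occurrences of proper nouns and one over unique proper nouns. The following is a minimal sketch of how the two measures can differ, not the authors' evaluation code. The function name `proper_noun_recall`, the position-based matching, and the toy data are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple


def proper_noun_recall(
    truth: Iterable[Tuple[int, str]],
    detected_positions: Iterable[int],
) -> Tuple[float, float]:
    """Return (occurrence_recall, unique_recall).

    truth: (word position, word) for every ground-truth proper noun occurrence.
    detected_positions: word positions flagged by a (hypothetical) detector.
    """
    truth = list(truth)
    if not truth:
        raise ValueError("ground truth must contain at least one proper noun")
    detected: Set[int] = set(detected_positions)

    # Occurrence-level recall: every instance in the running text counts.
    hits = [(pos, word) for pos, word in truth if pos in detected]
    occurrence_recall = len(hits) / len(truth)

    # Unique-level recall: a proper noun counts as found if at least
    # one of its occurrences was detected.
    unique_truth = {word for _, word in truth}
    unique_found = {word for _, word in hits}
    unique_recall = len(unique_found) / len(unique_truth)
    return occurrence_recall, unique_recall


# Toy data (invented): "Buffalo" occurs three times, "Hull" twice.
truth = [(3, "Buffalo"), (10, "Hull"), (17, "Buffalo"), (25, "Buffalo"), (31, "Hull")]
detected = [3, 10, 25, 31]  # the occurrence at position 17 is missed

occ, uniq = proper_noun_recall(truth, detected)
print(f"occurrence recall = {occ:.2f}, unique recall = {uniq:.2f}")
```

In this toy case one occurrence is missed, yet every distinct proper noun is still found at least once, so the unique-level figure is the higher of the two. Repeated names in a document would produce the same kind of gap as the one between the reported 90% and 97%.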
Pages: 311-320
Page count: 10
Related papers
50 in total
  • [21] Fast Glare Detection in Document Images
    Rodin, Dmitry
    Orlov, Nikita
    2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION WORKSHOPS (ICDARW) AND WORKSHOP ON INDUSTRIAL APPLICATIONS OF DOCUMENT ANALYSIS AND RECOGNITION, VOL 7, 2019, : 6 - 9
  • [22] On the proper treatment of noun-noun metaphor: A critique of the Sapper model
    Ferguson, RW
    Forbus, KD
    Gentner, D
    PROCEEDINGS OF THE NINETEENTH ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, 1997, : 913 - 913
  • [23] Korean Dependency Parsing with Proper Noun Encoding
    Nam, Gyu-Hyeon
    Lee, Hyun-Young
    Kang, Seung-Shik
    PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY (ICIIT 2019), 2019, : 113 - 117
  • [24] Tokenization and proper noun recognition for information retrieval
    Barcala, FM
    Vilares, J
    Alonso, MA
    Graña, J
    Vilares, M
    13TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2002, : 246 - 250
  • [25] The subject: Unconscious, origin, enunciation or the proper noun
    Garand, D
    UNIVERSITY OF TORONTO QUARTERLY, 1998, 68 (01) : 106 - 110
  • [26] An Evaluation of Table Detection Methods in Document Images
    Alexiou, Michail S.
    Petrakis, Euripides G. M.
    Bourbakis, Nikolaos G.
    2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023, : 54 - 63
  • [27] Continual Learning for Table Detection in Document Images
    Minouei, Mohammad
    Hashmi, Khurram Azeem
    Soheili, Mohammad Reza
    Afzal, Muhammad Zeshan
    Stricker, Didier
    APPLIED SCIENCES-BASEL, 2022, 12 (18):
  • [28] Grammar of the proper noun - French - GaryPrieur,MN
    Wiederspiel, B
    FRANCAIS MODERNE, 1997, 65 (02) : 217 - 220
  • [29] Mathematical Formula Detection in Heterogeneous Document Images
    Chu, Wei-Ta
    Liu, Fan
    2013 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2013, : 140 - 145
  • [30] Detection of Variable Regions in Complex Document Images
    Sreelakshmi, U. K.
    Akash, V. G.
    Rani, N. Shobha
    2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, : 807 - 811