Eigenspace method for text retrieval in historical document images

Authors
Terasawa, K [1]
Nagasaki, T [1]
Kawashima, T [1]
Affiliations
[1] Future Univ Hakodate, Sch Syst Informat Sci, Hakodate, Hokkaido 0418655, Japan
Source
EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS | 2005
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe a new method for text retrieval that does not require segmentation. Segmenting historical document images into individual characters is difficult, so conventional OCR, which relies on segmentation, does not work well. Our method instead divides the text image into a sequence of small slits. The image region that corresponds to the query image is retrieved by solving a matching problem between these sequences. Applying the eigenspace method to the slit images lets us solve the matching problem efficiently. Moreover, using dynamic time warping (DTW) further improves the results. Our method is more accurate than simple template matching and has a far lower computational cost.
Pages: 437-441
Page count: 5
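
The abstract gives no implementation details, so the following is only a minimal sketch of the pipeline it outlines: cut a text-line image into slits, project the slits into a low-dimensional eigenspace, and match a query against the document with DTW. Everything specific here is an assumption made for illustration: the function names, the slit width and step, the number of eigenspace dimensions, the use of PCA (via SVD) as the eigenspace method, and the subsequence variant of DTW. It should not be read as the authors' implementation.

```python
import numpy as np

def slit_sequence(image, width=4, step=1):
    """Cut a 2-D text-line image (height x length) into overlapping vertical slits."""
    _, length = image.shape
    starts = range(0, length - width + 1, step)
    return np.stack([image[:, s:s + width].ravel() for s in starts]).astype(float)

def fit_eigenspace(slits, dim=8):
    """PCA via SVD: return the mean slit and the top `dim` principal axes."""
    mean = slits.mean(axis=0)
    _, _, vt = np.linalg.svd(slits - mean, full_matrices=False)
    return mean, vt[:dim]

def project(slits, mean, axes):
    """Map flattened slits to their low-dimensional eigenspace coordinates."""
    return (slits - mean) @ axes.T

def subsequence_dtw(query, doc):
    """Find the span of `doc` that best matches `query` under DTW.

    Returns (cost, start, end) with `end` inclusive, as slit indices into `doc`.
    """
    m, n = len(query), len(doc)
    dist = np.linalg.norm(query[:, None, :] - doc[None, :, :], axis=2)
    acc = np.full((m, n), np.inf)       # accumulated cost
    start = np.zeros((m, n), dtype=int) # doc index where each path began
    acc[0] = dist[0]                    # a match may begin at any doc slit
    start[0] = np.arange(n)
    for i in range(1, m):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, n):
            prev = ((i - 1, j - 1), (i - 1, j), (i, j - 1))
            costs = [acc[p] for p in prev]
            k = int(np.argmin(costs))
            acc[i, j] = costs[k] + dist[i, j]
            start[i, j] = start[prev[k]]
    end = int(np.argmin(acc[-1]))       # a match may end at any doc slit
    return float(acc[-1, end]), int(start[-1, end]), end

# Synthetic demo: the query is a crop of the (random) document line.
rng = np.random.default_rng(0)
page = rng.random((16, 200))
doc_slits = slit_sequence(page)
mean, axes = fit_eigenspace(doc_slits)
cost, first, last = subsequence_dtw(
    project(slit_sequence(page[:, 60:100]), mean, axes),
    project(doc_slits, mean, axes),
)
print(cost, first, last)
```

Matching in the projected space compares 8-dimensional vectors rather than full 64-pixel slits, which is one plausible source of the efficiency gain the abstract reports; the quadratic DTW loop is written for clarity, not speed.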