Eigenspace method for text retrieval in historical document images

Authors
Terasawa, K [1]
Nagasaki, T [1]
Kawashima, T [1]
Affiliation
[1] Future Univ Hakodate, Sch Syst Informat Sci, Hakodate, Hokkaido 0418655, Japan
Source
EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS | 2005
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A new method for text retrieval that does not need segmentation is described. Segmenting the images in historical documents into individual characters is difficult. Therefore, the conventional OCR method, which uses segmentation, does not work well. Our method instead divides the text image into a sequence of small slits. The image region that corresponds to the query image region is retrieved by solving the matching problem of these sequences. Applying the eigenspace method to the slit images enables us to solve the matching problem efficiently. Moreover, using dynamic time warping (DTW) further improves the results. Our method has higher accuracy than the simple template matching method, and it is far more efficient in computational cost.
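The pipeline outlined in the abstract (slits, eigenspace projection, sequence matching with DTW) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the slit width, the number of eigenvectors `k`, the use of one slit per column offset, the sliding-window search, and all function names below are assumptions made only for illustration.

```python
import numpy as np

def cut_slits(image, width):
    """Cut a (height, length) text-line image into vertical slits.
    One slit per column offset, each flattened to a vector (assumed layout)."""
    _, length = image.shape
    return np.stack([image[:, i:i + width].ravel()
                     for i in range(length - width + 1)])

def fit_eigenspace(slits, k):
    """Return the mean slit and the top-k principal directions (PCA via SVD)."""
    mean = slits.mean(axis=0)
    _, _, vt = np.linalg.svd(slits - mean, full_matrices=False)
    return mean, vt[:k]

def project(slits, mean, basis):
    """Project slit vectors onto the k-dimensional eigenspace."""
    return (slits - mean) @ basis.T

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature sequences."""
    n, m = len(a), len(b)
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

def best_match(query_feats, doc_feats):
    """Slide a query-length window over the document sequence and
    return the start index with the smallest DTW distance."""
    q = len(query_feats)
    dists = [dtw_distance(query_feats, doc_feats[s:s + q])
             for s in range(len(doc_feats) - q + 1)]
    return int(np.argmin(dists)), dists
```

In this sketch, matching runs on k-dimensional projections rather than on raw pixel vectors, which is where the efficiency gain over plain template matching would come from. The DTW step tolerates local stretching between the query and the document sequence.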
Pages: 437-441
Page count: 5