Visual Similarity Based Document Layout Analysis

Authors
Di Wen
Xiao-Qing Ding
Affiliation
[1] Tsinghua University, Department of Electronic Engineering & State Key Laboratory of Intelligent Technology and Systems
Keywords
document layout analysis; texture analysis; dynamic clustering
DOI
Not available
Abstract
This paper proposes a visual similarity based document layout analysis (DLA) scheme that uses a clustering strategy to adapt to documents in different languages, with different layout structures and skew angles. Aiming at a robust and adaptive DLA approach, the authors first find a set of representative filters and statistics that characterize typical texture patterns in document images, selected through a visual similarity testing process. Texture features are then extracted with these filters and passed to a dynamic clustering procedure, called visual similarity clustering. Finally, text content is located from the clustering results. Owing to this scheme, the algorithm demonstrates strong robustness and adaptability across a wide variety of documents, which traditional DLA approaches do not possess.
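The pipeline described in the abstract (filter responses, per-region texture statistics, clustering, text localization) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two difference filters, the mean and standard deviation statistics, the fixed block size, plain k-means with k = 2 standing in for the dynamic clustering step, and the rule that the cluster with the stronger filter response is text. The function names are hypothetical.

```python
import numpy as np


def block_texture_features(image, block=16):
    """Per-block mean/std of two simple difference filters (assumed stand-ins)."""
    img = image.astype(float)
    # Absolute first differences along columns and rows act as crude edge filters.
    dx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))
    dy = np.abs(np.diff(img, axis=0, prepend=img[:1, :]))
    rows, cols = img.shape[0] // block, img.shape[1] // block
    feats = np.zeros((rows, cols, 4))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block),
                  slice(c * block, (c + 1) * block))
            feats[r, c] = [dx[sl].mean(), dx[sl].std(),
                           dy[sl].mean(), dy[sl].std()]
    return feats


def kmeans(points, k=2, iters=50):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.full(len(points), -1)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers


def locate_text_blocks(image, block=16):
    """Boolean block mask; the cluster with the strongest response is assumed to be text."""
    feats = block_texture_features(image, block)
    labels, centers = kmeans(feats.reshape(-1, feats.shape[-1]), k=2)
    text_cluster = centers.sum(axis=1).argmax()
    return (labels == text_cluster).reshape(feats.shape[:2])
```

Called on a grayscale page image as a 2-D NumPy array, `locate_text_blocks` returns one boolean per block. A real system along the lines of the abstract would replace the fixed filters and fixed k with the selected filter set and an adaptive clustering procedure.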
Pages: 459 - 465
Page count: 6
Related papers
  • [1] Visual similarity based document layout analysis
    Wen, Di
    Ding, Xiao-Qing
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2006, 21 (03) : 459 - 465
  • [2] Classification of document page images based on visual similarity of layout structures
    Shin, CK
    Doermann, DS
    DOCUMENT RECOGNITION AND RETRIEVAL VII, 2000, 3967 : 182 - 190
  • [3] Document page similarity based on layout visual saliency: Application to query by example and document classification
    Eglin, V
    Bres, S
    SEVENTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS I AND II, PROCEEDINGS, 2003, : 1208 - 1212
  • [4] Retrieval of document images based on page layout similarity
    Naveen
    Guru, D. S.
    ADAPTIVE MULTIMEDIA RETRIEVAL: USER, CONTEXT, AND FEEDBACK, 2007, 4398 : 136 - +
  • [5] Visual Detection with Context for Document Layout Analysis
    Soto, Carlos X.
    Yoo, Shinjae
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3464 - 3470
  • [6] Document Visual Similarity Measure For Document Search
    Ahmadullin, Ildus
    Allebach, Jan P.
    Damera-Venkata, Niranjan
    Fan, Jian
    Lee, Seungyon
    Lin, Qian
    Liu, Jerry
    DOCENG 2011: PROCEEDINGS OF THE 2011 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, 2011, : 139 - 142
  • [7] Document layout analysis based on emergent computation
    Ishitani, Y
    PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 45 - 50
  • [8] Document page image classification based on similarity of visual appearance
    Shin, Christian
    Doermann, David
    PROCEEDINGS OF THE EIGHTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING, 2006, : 145 - +
  • [9] Chinese document layout analysis based on texture features
    Wang, Y
    Tian, XD
    Guo, BL
    2002 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-4, PROCEEDINGS, 2002, : 1722 - 1725
  • [10] Fast CNN-based document layout analysis
    Borges Oliveira, Dario Augusto
    Viana, Matheus Palhares
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 1173 - 1180