Visual Similarity Based Document Layout Analysis

Authors
Di Wen
Xiao-Qing Ding
Affiliation
[1] Tsinghua University, Department of Electronic Engineering & State Key Laboratory of Intelligent Technology and Systems
Keywords
document layout analysis; texture analysis; dynamic clustering
DOI
Not available
Abstract
This paper proposes a visual similarity based document layout analysis (DLA) scheme that uses a clustering strategy to adapt to documents in different languages, with different layout structures and skew angles. Aiming at a robust and adaptive DLA approach, the authors first find a set of representative filters and statistics that characterize typical texture patterns in document images, selected through a visual similarity testing process. Texture features are then extracted from the filter outputs and passed to a dynamic clustering procedure, called visual similarity clustering. Finally, text content is located from the clustering results. As a result of this scheme, the algorithm demonstrates strong robustness and adaptability across a wide variety of documents, which traditional DLA approaches do not offer.
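The pipeline described in the abstract (filter responses, texture statistics, clustering, text location) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not name the filters, the statistics, or the clustering procedure, so the sketch assumes a toy horizontal difference filter, two hypothetical statistics (ink density and edge rate), and plain k-means as a stand-in for the dynamic clustering step. All function names here are illustrative.

```python
from statistics import mean

def block_features(img, top, left, size):
    """Two toy texture statistics for one block (assumed, not from the paper)."""
    rows = [img[r][left:left + size] for r in range(top, top + size)]
    density = sum(map(sum, rows)) / (size * size)
    # Mean absolute response of a [-1, 1] horizontal difference filter.
    edges = sum(abs(row[c + 1] - row[c]) for row in rows for c in range(size - 1))
    return (density, edges / (size * (size - 1)))

def kmeans(points, k, iters=20):
    """Plain k-means, seeded deterministically with the first k distinct points."""
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centers

def cluster_blocks(img, size, k=2):
    """Cut the image into non-overlapping blocks and cluster their features."""
    coords = [(r, c)
              for r in range(0, len(img) - size + 1, size)
              for c in range(0, len(img[0]) - size + 1, size)]
    feats = [block_features(img, r, c, size) for r, c in coords]
    labels, centers = kmeans(feats, k)
    return dict(zip(coords, labels)), centers

# Synthetic 8x8 binary image: striped "text-like" left half, blank right half.
img = [[c % 2 if c < 4 else 0 for c in range(8)] for _ in range(8)]
labels, centers = cluster_blocks(img, size=4)

# Assumption: the cluster with the highest edge rate is treated as text.
text_cluster = max(range(len(centers)), key=lambda j: centers[j][1])
text_blocks = sorted(b for b, lab in labels.items() if lab == text_cluster)
```

On this synthetic image the two striped blocks fall into one cluster and the two blank blocks into the other. Because the blocks are grouped by how similar their features are, rather than by fixed thresholds, the same code applies unchanged to a different script or layout, which is the adaptivity the abstract attributes to the clustering strategy.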
Pages: 459-465
Number of pages: 6