A fast algorithm for bottom-up document layout analysis

Cited by: 87
Authors
Simon, A
Pret, JC
Johnson, AP
Affiliation
[1] Institute for Computer Applications in Molecular Sciences, School of Chemistry, University of Leeds, Leeds
Keywords
document analysis; physical page layout; bottom-up layout analysis; Kruskal's algorithm; spanning tree; chemical documents;
DOI
10.1109/34.584106
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a new bottom-up method for document layout analysis. The algorithm was implemented in the CLiDE (Chemical Literature Data Extraction) system (http://chem.leeds.ac.uk/ICAMS/CLiDE.html), but the method described here is suitable for a broader range of documents. It is based on Kruskal's algorithm and uses a special distance metric between the components to construct the physical page structure. The method has all the major advantages of bottom-up systems: independence from different text spacing and independence from different block alignments. The algorithm's computational complexity is reduced to linear by using heuristics and path compression.
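The abstract names two building blocks: Kruskal's algorithm and path compression. The following is only a minimal sketch of how they can combine to group page components; it is not the paper's implementation. The bounding-box gap used as the distance, the `max_gap` cutoff, and the names `DisjointSet`, `box_gap` and `group_boxes` are all assumptions made for illustration, since the abstract does not specify the paper's special distance metric or its heuristics. The sketch also sorts all pairwise edges, so it does not reach the linear complexity the abstract reports.

```python
from itertools import combinations
from math import hypot


class DisjointSet:
    """Union-find structure with path compression (hypothetical helper)."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # First pass: locate the root of x's set.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same set: edge would close a cycle
        self.parent[rb] = ra
        return True


def box_gap(a, b):
    """Assumed distance: gap between two boxes given as (x0, y0, x1, y1)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return hypot(dx, dy)


def group_boxes(boxes, max_gap):
    """Kruskal-style merging: process edges by increasing distance and
    stop once the distance exceeds max_gap (an assumed cutoff)."""
    sets = DisjointSet(len(boxes))
    edges = sorted(
        (box_gap(boxes[i], boxes[j]), i, j)
        for i, j in combinations(range(len(boxes)), 2)
    )
    for dist, i, j in edges:
        if dist > max_gap:
            break
        sets.union(i, j)
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(sets.find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Two pairs of nearby boxes, far apart from each other.
    page = [(0, 0, 10, 10), (12, 0, 20, 10),
            (100, 100, 110, 110), (112, 100, 120, 110)]
    print(group_boxes(page, max_gap=5))
```

Stopping at a distance cutoff turns the spanning tree into a forest, and each tree in the forest is one candidate block. How the paper chooses where to cut, and how it avoids building every edge, is not stated in the abstract.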
Pages: 273-277
Number of pages: 5