Segmentation for document layout analysis: not dead yet

Authors
Logan Markewich
Hao Zhang
Yubin Xing
Navid Lambert-Shirzad
Zhexin Jiang
Roy Ka-Wei Lee
Zhi Li
Seok-Bum Ko
Affiliations
[1] University of Saskatchewan
[2] Living Sky Technologies Inc.
Keywords
Computer vision; Semantic segmentation; Document layout analysis; Annotation
Abstract
Document layout analysis is often the first task in document understanding systems, where a document is broken down into identifiable sections. One of the most common approaches to this task is image segmentation, where each pixel in a document image is classified. However, this task is challenging because, as the number of classes increases, small and infrequent objects are often missed. In this paper, we propose a weighted bounding box regression loss methodology to improve the accuracy of document layout segmentation, and demonstrate our results on our dense article dataset (DAD) and the existing PubLayNet dataset. First, we collect and annotate 43 document object classes across 450 open access research articles to construct DAD. After benchmarking several segmentation networks, we achieve an F1 score of 96.26% on DAD and 97.11% on PubLayNet with DeepLabV3+, and show that a bounding box regression method applied to the segmentation results improves F1 by 1.99 points on DAD. Finally, we demonstrate that networks trained on DAD can be used as a bootstrapped annotation tool for existing document layout datasets, decreasing annotation time by 38% with DeepLabV3+.
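To make the two quantities named in the abstract concrete, here is a minimal sketch of what per-pixel classification output looks like and how a pixel-level F1 score and a bounding box could be derived from it. This is not the authors' implementation: the function names, the background id of 0, the toy masks, and the choice of one tight box per class (rather than per object instance) are all assumptions made for illustration, and the paper's exact F1 definition may differ.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]           # one class id per pixel, row-major
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive

BACKGROUND = 0  # assumed id for "no object"; hypothetical convention


def class_bounding_boxes(mask: Mask) -> Dict[int, Box]:
    """Return the tight box around all pixels of each non-background class."""
    boxes: Dict[int, Box] = {}
    for y, row in enumerate(mask):
        for x, cls in enumerate(row):
            if cls == BACKGROUND:
                continue
            if cls in boxes:
                x0, y0, x1, y1 = boxes[cls]
                boxes[cls] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
            else:
                boxes[cls] = (x, y, x, y)
    return boxes


def pixel_f1(pred: Mask, target: Mask, cls: int) -> float:
    """Pixel-level F1 for one class: 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1
            elif t == cls:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 3x4 "document": class 1 and class 2 regions on a background of 0.
target = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 2],
]
pred = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],  # one class-1 pixel missed
    [0, 0, 2, 2],
]

boxes = class_bounding_boxes(pred)
f1_class1 = pixel_f1(pred, target, 1)
```

In the toy prediction, a single missed pixel lowers the class-1 F1 while leaving its box unchanged, which hints at why small and infrequent objects are the hard case: with few pixels per object, each error carries more weight.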
Pages: 67-77
Page count: 10
Related papers
  • [1] Segmentation for document layout analysis: not dead yet
    Markewich, Logan
    Zhang, Hao
    Xing, Yubin
    Lambert-Shirzad, Navid
    Jiang, Zhexin
    Lee, Roy Ka-Wei
    Li, Zhi
    Ko, Seok-Bum
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2022, 25 (02) : 67 - 77
  • [2] DOCUMENT IMAGE SEGMENTATION AND LAYOUT ANALYSIS
    SAITOH, T
    YAMAAI, T
    TACHIKAWA, M
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1994, E77D (07) : 778 - 784
  • [3] Document page segmentation and layout analysis using soft ordering
    Mitchell, Phillip E.
    Yan, Hong
    Proceedings - International Conference on Pattern Recognition, 2000, 1 : 458 - 461
  • [4] Document page segmentation and layout analysis using soft ordering
    Mitchell, PE
    Yan, H
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS: COMPUTER VISION AND IMAGE ANALYSIS, 2000, : 458 - 461
  • [5] Arabic document layout analysis
    Hesham, Amany M.
    Rashwan, Mohsen A. A.
    Al-Barhamtoshy, Hassanin M.
    Abdou, Sherif M.
    Badr, Amr A.
    Farag, Ibrahim
    PATTERN ANALYSIS AND APPLICATIONS, 2017, 20 (04) : 1275 - 1287
  • [6] Arabic document layout analysis
    Amany M. Hesham
    Mohsen A. A. Rashwan
    Hassanin M. Al-Barhamtoshy
    Sherif M. Abdou
    Amr A. Badr
    Ibrahim Farag
    Pattern Analysis and Applications, 2017, 20 : 1275 - 1287
  • [7] A document straight line based segmentation for complex layout extraction
    Alheritiere, Heloise
    Cloppet, Florence
    Kurtz, Camille
    Ogier, Jean-Marc
    Vincent, Nicole
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 1126 - 1131
  • [8] Simple layout segmentation of gray-scale document images
    Suvichakorn, A
    Watcharabusaracum, S
    Sinthupinyo, W
    DOCUMENT ANALYSIS SYSTEM V, PROCEEDINGS, 2002, 2423 : 245 - 248
  • [9] A Hybrid Approach for Document Layout Analysis in Document Images
    Shehzadi, Tahira
    Stricker, Didier
    Afzal, Muhammad Zeshan
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT IV, 2024, 14807 : 21 - 39
  • [10] Accurate Fine-Grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation
    Zhao, Penghai
    Wang, Weilan
    Cai, Zhengqi
    Zhang, Guowei
    Lu, Yuqi
    IEEE ACCESS, 2021, 9 : 154435 - 154447