Segmentation for document layout analysis: not dead yet

Authors
Logan Markewich
Hao Zhang
Yubin Xing
Navid Lambert-Shirzad
Zhexin Jiang
Roy Ka-Wei Lee
Zhi Li
Seok-Bum Ko
Affiliations
[1] University of Saskatchewan
[2] Living Sky Technologies Inc.
Keywords
Computer vision; Semantic segmentation; Document layout analysis; Annotation
Abstract
Document layout analysis is often the first task in a document understanding system: the document is broken down into identifiable sections. One of the most common approaches is image segmentation, in which each pixel of a document image is classified. The task is challenging because small and infrequent objects are often missed as the number of classes increases. In this paper, we propose a weighted bounding box regression loss methodology to improve the accuracy of document layout segmentation, and we demonstrate our results on our dense article dataset (DAD) and the existing PubLayNet dataset. First, we collect and annotate 43 document object classes across 450 open access research articles to construct DAD. After benchmarking several segmentation networks, we achieve an F1 score of 96.26% on DAD and 97.11% on PubLayNet with DeeplabV3+, and we show that a bounding box regression method applied to the segmentation results improves F1 by 1.99 points on DAD. Finally, we demonstrate that the networks trained on DAD can be used as a bootstrapped annotation tool for existing document layout datasets, decreasing annotation time by 38% with DeeplabV3+.
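To make the point about small, infrequent classes concrete, the following is a minimal sketch of a pixel-wise, per-class F1 computation on a toy label grid. It is an illustration only, not the paper's evaluation code: the abstract does not say how F1 is averaged, so the macro average here is an assumption, and the function name `per_class_f1`, the class ids, and the toy grids are all hypothetical.

```python
from collections import Counter


def per_class_f1(truth, pred):
    """Pixel-wise F1 per class for two equally sized 2D label grids."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t == p:
                tp[t] += 1
            else:
                fp[p] += 1  # predicted class p where it does not belong
                fn[t] += 1  # missed a pixel of the true class t
    scores = {}
    for c in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


# Toy 4x6 "page" (hypothetical): 0 = background, 1 = body text,
# 2 = a one-pixel object standing in for a small, rare class.
truth = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 2, 0],
]
# Hypothetical prediction that misses the small object entirely.
pred = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

scores = per_class_f1(truth, pred)
macro_f1 = sum(scores.values()) / len(scores)  # assumed averaging scheme
total_pixels = sum(len(row) for row in truth)
pixel_accuracy = sum(
    t == p for tr, pr in zip(truth, pred) for t, p in zip(tr, pr)
) / total_pixels
```

In this toy case only one pixel is wrong, so pixel accuracy stays high, yet the missed class scores an F1 of zero and pulls the macro average down sharply. That gap is one way to see why rare classes need explicit attention as the class count grows.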
Pages: 67-77
Number of pages: 10