Segmentation for document layout analysis: not dead yet

Authors
Logan Markewich
Hao Zhang
Yubin Xing
Navid Lambert-Shirzad
Zhexin Jiang
Roy Ka-Wei Lee
Zhi Li
Seok-Bum Ko
Affiliations
[1] University of Saskatchewan
[2] Living Sky Technologies Inc.
Keywords
Computer vision; Semantic segmentation; Document layout analysis; Annotation
Abstract
Document layout analysis is often the first task in a document understanding system: the document is broken down into identifiable sections. One of the most common approaches is image segmentation, in which each pixel of a document image is classified. The task is challenging because, as the number of classes increases, small and infrequent objects are often missed. In this paper, we propose a weighted bounding box regression loss to improve the accuracy of document layout segmentation, and we demonstrate results on our dense article dataset (DAD) and the existing PubLayNet dataset. First, we collect and annotate 43 document object classes across 450 open access research articles to construct DAD. After benchmarking several segmentation networks, we achieve an F1 score of 96.26% on DAD and 97.11% on PubLayNet with DeepLabV3+, and we show that a bounding box regression method applied to the segmentation results improves F1 by 1.99 points on DAD. Finally, we demonstrate that networks trained on DAD can be used as a bootstrapped annotation tool for existing document layout datasets, decreasing annotation time by 38% with DeepLabV3+.
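The abstract's point that small and infrequent objects get missed can be illustrated with a toy pixel-wise evaluation. The sketch below is not the paper's evaluation protocol, which the abstract does not specify; the function name `per_class_f1`, the two class labels, and the 4x4 grid are all assumptions made for illustration. It shows how a prediction that ignores a rare class can still score high pixel accuracy while the per-class F1 for that class is zero.

```python
from collections import Counter


def per_class_f1(truth, pred):
    """Return {class_label: F1} for two equally sized 2-D label grids.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t == p:
                tp[t] += 1          # pixel correctly assigned to class t
            else:
                fp[p] += 1          # pixel wrongly claimed by class p
                fn[t] += 1          # pixel of class t that was missed
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy 4x4 "page": 0 = body text, 1 = a one-pixel small object (assumed classes).
truth = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
# A prediction that ignores the small class entirely.
pred = [[0] * 4 for _ in range(4)]

scores = per_class_f1(truth, pred)
pixel_accuracy = sum(
    t == p for tr, pr in zip(truth, pred) for t, p in zip(tr, pr)
) / 16
macro_f1 = sum(scores.values()) / len(scores)

print(f"pixel accuracy: {pixel_accuracy:.4f}")
print(f"per-class F1:   {scores}")
print(f"macro F1:       {macro_f1:.4f}")
```

In this toy case pixel accuracy stays above 90% while the macro-averaged F1 falls below 50%, which is one reason per-class metrics are commonly reported for layout segmentation with many classes.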
Pages: 67-77
Page count: 10