CasTabDetectoRS: Cascade Network for Table Detection in Document Images with Recursive Feature Pyramid and Switchable Atrous Convolution

Cited by: 19
Authors
Hashmi, Khurram Azeem [1, 2, 3]
Pagani, Alain [3]
Liwicki, Marcus [4]
Stricker, Didier [1, 3]
Afzal, Muhammad Zeshan [1, 2, 3]
Affiliations
[1] Tech Univ Kaiserslautern, Dept Comp Sci, D-67663 Kaiserslautern, Germany
[2] Tech Univ Kaiserslautern, Mindgarage, D-67663 Kaiserslautern, Germany
[3] German Res Inst Artificial Intelligence DFKI, D-67663 Kaiserslautern, Germany
[4] Lulea Univ Technol, Dept Comp Sci, S-97187 Lulea, Sweden
Keywords
table detection; table recognition; cascade Mask R-CNN; atrous convolution; recursive feature pyramid networks; document image analysis; deep neural networks; computer vision; object detection; RECOGNITION;
DOI
10.3390/jimaging7100214
Chinese Library Classification (CLC) number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Table detection is a preliminary step in extracting reliable information from tables in scanned document images. We present CasTabDetectoRS, a novel end-to-end trainable table detection framework that operates on Cascade Mask R-CNN, including Recursive Feature Pyramid network and Switchable Atrous Convolution in the existing backbone architecture. By utilizing a comparatively lightweight backbone of ResNet-50, this paper demonstrates that superior results are attainable without relying on pre- and post-processing methods, heavier backbone networks (ResNet-101, ResNeXt-152), and memory-intensive deformable convolutions. We evaluate the proposed approach on five different publicly available table detection datasets. Our CasTabDetectoRS outperforms the previous state-of-the-art results on four datasets (ICDAR-19, TableBank, UNLV, and Marmot) and accomplishes comparable results on ICDAR-17 POD. Upon comparing with previous state-of-the-art results, we obtain a significant relative error reduction of 56.36%, 20%, 4.5%, and 3.5% on the datasets of ICDAR-19, TableBank, UNLV, and Marmot, respectively. Furthermore, this paper sets a new benchmark by performing exhaustive cross-datasets evaluations to exhibit the generalization capabilities of the proposed method.
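The abstract reports gains as relative error reduction but does not state the formula or the underlying scores. A minimal sketch of how such a figure is commonly computed follows, assuming the error is defined as 1 minus a score such as F1 (an assumption, not confirmed by this record). The function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(prev_score: float, new_score: float) -> float:
    """Return the relative error reduction in percent.

    Assumes error = 1 - score, with scores in [0, 1] (e.g. F1).
    This definition is an assumption made for illustration.
    """
    prev_error = 1.0 - prev_score
    new_error = 1.0 - new_score
    if prev_error <= 0.0:
        # A perfect previous score leaves no error to reduce.
        raise ValueError("previous score must be below 1.0")
    return 100.0 * (prev_error - new_error) / prev_error


if __name__ == "__main__":
    # Hypothetical scores, chosen only to illustrate the arithmetic:
    # the error drops from 0.10 to 0.05, i.e. it is roughly halved.
    print(round(relative_error_reduction(0.90, 0.95), 2))
```

Under this definition a small absolute gain on an already high score can correspond to a large relative reduction, which is why the two kinds of figures should not be compared directly.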
Pages: 23
Related papers
50 in total
  • [1] DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
    Qiao, Siyuan
    Chen, Liang-Chieh
    Yuille, Alan
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 10208 - 10219
  • [2] Pulmonary nodule detection based on Hierarchical-Split HRNet and feature pyramid network with atrous convolution
    Zhu, Ling
    Zhu, Hongqing
    Yang, Suyi
    Wang, Pengyu
    Huang, Hui
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 85
  • [3] CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images
    Agarwal, Madhav
    Mondal, Ajoy
    Jawahar, C. V.
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9491 - 9498
  • [4] Multi-layer Feature Fusion Network with Atrous Convolution for Pedestrian Detection
    Li, You
    Zhang, Qingxuan
    Zhang, Yulei
    2019 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, AUTOMATION AND CONTROL TECHNOLOGIES (AIACT 2019), 2019, 1267
  • [5] ARFP: A novel adaptive recursive feature pyramid for object detection in aerial images
    Wang, Junjie
    Yu, Jiong
    He, Zhu
    APPLIED INTELLIGENCE, 2022, 52 : 12844 - 12859
  • [6] AFPNet: A 3D fully convolutional neural network with atrous-convolution feature pyramid for brain tumor segmentation via MRI images
    Zhou, Zexun
    He, Zhongshi
    Jia, Yuanyuan
    NEUROCOMPUTING, 2020, 402 : 235 - 244
  • [7] ARFP: A novel adaptive recursive feature pyramid for object detection in aerial images
    Wang, Junjie
    Yu, Jiong
    He, Zhu
    APPLIED INTELLIGENCE, 2022, 52 (11) : 12844 - 12859
  • [8] Salient Feature Pyramid Network for Ship Detection in SAR Images
    Tang, Yu
    Wang, Shigang
    Wei, Jian
    Zhao, Yan
    Lin, Jiehua
    IEEE SENSORS JOURNAL, 2024, 24 (03) : 3036 - 3045
  • [9] Dense Feature Pyramid Network for Ship Detection in SAR Images
    Hu, Weihua
    Tian, Zhuangzhuang
    Chen, Shiqi
    Zhan, Ronghui
    Zhang, Jun
    2020 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO PROCESSING AND ARTIFICIAL INTELLIGENCE, 2020, 11584
  • [10] Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs
    Fu, Lei
    Gu, Wen-bin
    Li, Wei
    Chen, Liang
    Ai, Yong-bao
    Wang, Hua-lei
    DEFENCE TECHNOLOGY, 2021, 17 (04) : 1531 - 1541