End-to-end semi-supervised approach with modulated object queries for table detection in documents

Cited by: 0
Authors
Ehsan, Iqraa [1]
Shehzadi, Tahira [1, 2, 3]
Stricker, Didier [1, 2, 3]
Afzal, Muhammad Zeshan [1, 2, 3]
Affiliations
[1] Tech Univ Kaiserslautern, Dept Comp Sci, D-67663 Kaiserslautern, Germany
[2] Tech Univ Kaiserslautern, Mindgarage, D-67663 Kaiserslautern, Germany
[3] German Res Inst Artificial Intelligence DFKI, Comp Vis, D-67663 Kaiserslautern, Germany
Keywords
Table detection; Document analysis; Semi-supervised learning; Detection transformer
DOI
10.1007/s10032-024-00471-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Table detection, a pivotal task in document analysis, aims to precisely recognize and locate tables within document images. Although deep learning has shown remarkable progress in this area, it typically requires a large amount of labeled data for effective training. Current CNN-based semi-supervised table detection approaches rely on anchor generation and non-maximum suppression in their detection process, which limits training efficiency. Meanwhile, transformer-based semi-supervised techniques adopt a one-to-one matching strategy that produces noisy pseudo-labels, limiting overall efficiency. This study presents a transformer-based semi-supervised table detector. It improves the quality of pseudo-labels through a novel matching strategy that combines one-to-one and one-to-many assignment techniques. This approach significantly enhances training efficiency during the early stages, ensuring superior pseudo-labels for further training. Our semi-supervised approach is comprehensively evaluated on benchmark datasets, including PubLayNet, ICDAR-19, and TableBank. It achieves new state-of-the-art results, with mAP of 95.7% and 97.9% on TableBank (word) and PubLayNet, respectively, using 30% labeled data, a 7.4- and 7.6-point improvement over the previous semi-supervised table detection approach. The results show that our semi-supervised approach surpasses existing state-of-the-art methods by substantial margins. This research represents a significant advancement in semi-supervised table detection, offering a more efficient and accurate solution for practical document analysis tasks.
Pages: 363-378
Page count: 16