Cross-domain document layout analysis using document style guide

Cited by: 0
Authors
Wu, Xingjiao [1, 2]
Xiao, Luwei [2, 3]
Du, Xiangcheng [1, 4]
Zheng, Yingbin [4]
Li, Xin
Ma, Tianlong [1, 2, 3]
Jin, Cheng [1]
He, Liang [2, 3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200062, Peoples R China
[3] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[4] Videt Lab, Shanghai 201203, Peoples R China
Keywords
Data generation; Document layout analysis; Deep learning; Document cross-domain analysis
DOI
10.1016/j.eswa.2023.123039
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document layout analysis (DLA) is a crucial computer vision task that partitions document images into high-level semantic regions such as figures, tables, backgrounds, and text. Deep learning models for DLA typically require large amounts of labeled data, which are expensive to obtain. Although some researchers train on generated data, a substantial style gap exists between the generated and target data. Moreover, the quality of the generated samples must be improved to achieve better control. To address these challenges, we propose a cross-domain DLA framework called DL-DSG, which leverages document style guidance. DL-DSG comprises three components: the document layout generator (DLG), which generates document element locations; the document element decorator (DED), which fills the elements; and the document style discriminator (DSD), which provides style guidance. In addition to generating controlled documents, we focus on bridging the gap between the generated and target samples. To this end, we introduce a novel strategy that transforms document style judgment into the document cross-domain style guidance component. We evaluate DL-DSG on popular DLA datasets, including PubLayNet, DSSE-200, CS-150, and CDSSE, and demonstrate its superior performance.
Pages: 11