GraphRevisedIE: Multimodal information extraction with graph-revised network

Cited by: 6
Authors
Cao, Panfeng [1]
Wu, Jian [2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China
Keywords
Document information extraction; Graph convolutional network; Transformer; Images
DOI
10.1016/j.patcog.2023.109542
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Key information extraction (KIE) from visually rich documents (VRD) has been a challenging task in document intelligence for two reasons: the complicated and diverse layouts of VRD make it hard for models to generalize, and methods for exploiting the multimodal features in VRD are lacking. In this paper, we propose a lightweight model named GraphRevisedIE that effectively embeds multimodal features such as textual, visual, and layout features from VRD and leverages graph revision and graph convolution to enrich the multimodal embedding with global context. Extensive experiments on multiple real-world datasets show that GraphRevisedIE generalizes to documents of varied layouts and achieves performance comparable to or better than previous KIE methods. We also publish a business license dataset that contains both real-life and synthesized documents to facilitate research on document KIE. © 2023 Elsevier Ltd. All rights reserved.
Pages: 9