SGFNet: A semantic graph-based multimodal network for financial invoice information extraction

Cited by: 0
Authors
Luo, Shun [1]
Yu, Juan [1]
Affiliations
[1] Fuzhou Univ, Sch Econ & Management, 2 Wulongjiang North Ave, Fuzhou 350108, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Invoice information extraction; Semantic graph; Multimodal modeling
DOI
10.1016/j.eswa.2024.125156
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To meet the financial industry's demand for high-volume invoice entry and to improve on the low accuracy of traditional manual entry, we construct SGFNet, a financial invoice information extraction network that integrates semantic graph associations and multimodal modeling. First, we construct a graph of strong and weak semantic associations between data within each modality based on the correlation of text content. Next, we model the multimodal data in a unified structure, extract the text modal information of invoices along with the corresponding image and layout modal information, and guide the fusion and embedding of multimodal data through the semantic associations in the graph to produce a richer feature representation. The semantically linked multimodal information is then fed into an aggregated multimodal self-attention mechanism to establish effective connections between modalities. Finally, the loss function combines supervised contrastive learning with smoothed Kullback-Leibler divergence, which reduces the accuracy degradation caused by sample imbalance and convergence instability. In our experiments, SGFNet achieved F1 scores of 93.71% on the English financial invoice dataset and 96.27% on the Chinese dataset, indicating that the proposed method successfully extracts feature information from different data modalities and achieves satisfactory information extraction results.
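The loss combination mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes the textbook forms of both terms, namely a KL divergence against label-smoothed targets and the standard supervised contrastive loss over L2-normalised embeddings. All function names and the values of `eps`, `temperature`, and `weight` are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def smoothed_targets(label, num_classes, eps=0.1):
    # One-hot target softened by label smoothing (eps is an assumed setting).
    return [(1.0 - eps) * (1.0 if k == label else 0.0) + eps / num_classes
            for k in range(num_classes)]

def smoothed_kl(logits, label, eps=0.1):
    # KL(q || p) between the smoothed target q and the predicted distribution p.
    p = softmax(logits)
    q = smoothed_targets(label, len(logits), eps)
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)

def sup_con_loss(embeddings, labels, temperature=0.1):
    # Supervised contrastive loss: pull same-label embeddings together,
    # push different-label embeddings apart. Anchors without positives are skipped.
    def normalise(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    z = [normalise(v) for v in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        others = [a for a in range(len(z)) if a != i]
        positives = [p for p in others if labels[p] == labels[i]]
        if not positives:
            continue
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in others}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def combined_loss(logits_batch, embeddings, labels, weight=0.5):
    # Weighted sum of the two terms; the weighting scheme is an assumption.
    kl = sum(smoothed_kl(l, y) for l, y in zip(logits_batch, labels)) / len(labels)
    return kl + weight * sup_con_loss(embeddings, labels)
```

Under these assumptions, the contrastive term rewards embeddings that cluster by label, which is one common way to soften the effect of under-represented classes, while the smoothed targets keep the classification term from pushing predictions toward overconfident extremes.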
Pages: 9
Related papers
50 in total
  • [31] Graph-based Partitioning of Ontology with Semantic Similarity
    Ghafourian, Soudabeh
    Rezaeian, Amin
    Naghibzadeh, Mahmoud
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE 2013), 2013, : 80 - 85
  • [32] Graph-Based Taxonomic Semantic Class Labeling
    Kirigin, Tajana Ban
    Bujacic Babic, Sanda
    Perak, Benedikt
    FUTURE INTERNET, 2022, 14 (12):
  • [33] A semantic graph-based approach to biomedical summarisation
    Plaza, Laura
    Diaz, Alberto
    Gervas, Pablo
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2011, 53 (01) : 1 - 14
  • [34] Graph-based Arabic text semantic representation
    Etaiwi, Wael
    Awajan, Arafat
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (03)
  • [35] Graph-based automatic acquisition of semantic classes
    Wu, Yunfang
    Shi, Jing
    Jin, Peng
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2011, 48 (04) : 610 - 616
  • [36] GISNet: Graph-Based Information Sharing Network For Vehicle Trajectory Prediction
    Zhao, Ziyi
    Fang, Haowen
    Jin, Zhao
    Qiu, Qinru
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [37] MIVCN: Multimodal interaction video captioning network based on semantic association graph
    Wang, Ying
    Huang, Guoheng
    Lin, Yuming
    Yuan, Haoliang
    Pun, Chi-Man
    Ling, Wing-Kuen
    Cheng, Lianglun
    APPLIED INTELLIGENCE, 2022, 52 (05) : 5241 - 5260
  • [39] G2SAM: Graph-Based Global Semantic Awareness Method for Multimodal Sarcasm Detection
    Wei, Yiwei
    Yuan, Shaozu
    Zhou, Hengyang
    Wang, Longbiao
    Yan, Zhiling
    Yang, Ruosong
    Chen, Meng
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 8, 2024, : 9151 - 9159
  • [40] A graph-based information retrieval system
    Thammasut, Duangjai
    Sornil, Ohm
    2006 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES,VOLS 1-3, 2006, : 793 - +