VEG-MMKG: Multimodal knowledge graph construction for vegetables based on pre-trained model extraction

Cited: 0
Authors
Lv, Bowen [1 ,2 ,3 ,4 ]
Wu, Huarui [1 ,3 ,4 ]
Chen, Wenbai [2 ]
Chen, Cheng [1 ]
Miao, Yisheng [1 ,3 ,4 ]
Zhao, Chunjiang [1 ]
Affiliations
[1] Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China
[3] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China
[4] Minist Agr & Rural Affairs, Key Lab Digital Village Technol, Beijing 100097, Peoples R China
Keywords
Knowledge graph; Multimodal fusion; Image-text pairs; Pre-trained model
DOI
10.1016/j.compag.2024.109398
Chinese Library Classification (CLC)
S [Agricultural Sciences]
Discipline Classification Code
09
Abstract
Knowledge graph technology is of great significance to modern agricultural information management and data-driven decision support. However, agricultural knowledge is rich in types, and agricultural knowledge graph databases built from text alone are not conducive to users' intuitive perception and comprehensive understanding of that knowledge. In view of this, this paper proposes a solution that uses a pre-trained language model to extract knowledge and construct an agricultural multimodal knowledge graph. Taking two plants, cabbage and corn, as research objects, the paper first adopts a text-image collaborative representation learning method with a two-stream structure, combining the image-modality information of vegetables with the text-modality information and exploiting the correlation and complementarity between the two to achieve entity alignment. In addition, to address the high similarity of vegetable entities within fine-grained subcategories, a cross-modal fine-grained contrastive learning method is introduced: contrasting individual words against small image regions compensates for the insufficient semantic association between modalities. Finally, a visual multimodal knowledge graph user interface is constructed from the image-text matching results. Experimental results show that the fine-tuned pre-trained model achieves an image-text matching accuracy of 76.7% on the vegetable dataset and can match appropriate images to text entities. The resulting visual multimodal knowledge graph database allows users to query and filter knowledge according to their needs, supporting subsequent research on applications in specific domains such as multimodal agricultural question answering, crop pest and disease identification, and agricultural product recommendation.
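A minimal sketch of the two-stream image-text matching step described in the abstract is given below. It uses the open-source CLIP model from Hugging Face transformers as a stand-in for the paper's pre-trained two-stream encoder; the checkpoint name, entity strings, and image paths are illustrative assumptions, not details taken from the paper.

```python
# Two-stream image-text matching sketch: encode vegetable entity names and
# candidate images with a pre-trained dual-encoder, then pick the best image
# per entity. CLIP here is a stand-in for the paper's fine-tuned model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

entity_texts = ["cabbage soft rot", "corn rust", "cabbage seedling"]  # hypothetical entities
image_paths = ["img0.jpg", "img1.jpg", "img2.jpg"]                    # placeholder paths
candidate_images = [Image.open(p).convert("RGB") for p in image_paths]

inputs = processor(text=entity_texts, images=candidate_images,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_text[i, j] is the similarity between entity i and image j;
# the row-wise argmax gives each entity its best-matching image.
best = outputs.logits_per_text.argmax(dim=1)
for entity, idx in zip(entity_texts, best.tolist()):
    print(f"{entity} -> {image_paths[idx]}")
```

In the paper, the analogous encoder is fine-tuned on vegetable image-text pairs before matching; the reported 76.7% matching figure refers to that fine-tuned model.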
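The cross-modal fine-grained contrastive objective, which contrasts individual words against small image regions, can likewise be sketched schematically. The tensor shapes, the max-over-patches/mean-over-tokens aggregation, and the temperature below are assumptions chosen for illustration; the abstract does not specify the exact formulation.

```python
# Schematic fine-grained word-to-region contrastive loss: every word token is
# scored against every image patch, each token keeps its best-matching patch,
# and a symmetric InfoNCE loss is applied over the batch, where matched
# image-text pairs sit on the diagonal.
import torch
import torch.nn.functional as F

def fine_grained_contrastive_loss(word_emb: torch.Tensor,
                                  patch_emb: torch.Tensor,
                                  temperature: float = 0.07) -> torch.Tensor:
    """word_emb: (B, T, D) word-token embeddings; patch_emb: (B, P, D) patch embeddings."""
    word_emb = F.normalize(word_emb, dim=-1)
    patch_emb = F.normalize(patch_emb, dim=-1)
    # sim[i, j, t, p]: cosine similarity of token t in text i vs patch p in image j.
    sim = torch.einsum("itd,jpd->ijtp", word_emb, patch_emb)
    # Each token attends to its best patch, then token scores are averaged,
    # yielding one text-image score per (i, j) pair.
    logits = sim.max(dim=-1).values.mean(dim=-1) / temperature  # (B, B)
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy in the text-to-image and image-to-text directions.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

# Toy usage with random embeddings standing in for real encoder outputs.
loss = fine_grained_contrastive_loss(torch.randn(4, 12, 256), torch.randn(4, 49, 256))
```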
Pages: 13