VEG-MMKG: Multimodal knowledge graph construction for vegetables based on pre-trained model extraction

Times Cited: 0
Authors
Lv, Bowen [1 ,2 ,3 ,4 ]
Wu, Huarui [1 ,3 ,4 ]
Chen, Wenbai [2 ]
Chen, Cheng [1 ]
Miao, Yisheng [1 ,3 ,4 ]
Zhao, Chunjiang [1 ]
Affiliations
[1] Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China
[3] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China
[4] Minist Agr & Rural Affairs, Key Lab Digital Village Technol, Beijing 100097, Peoples R China
Keywords
Knowledge graph; Multimodal fusion; Image-text pairs; Pre-trained model;
DOI
10.1016/j.compag.2024.109398
Chinese Library Classification (CLC)
S [Agricultural Sciences];
Subject Classification Code
09;
Abstract
Knowledge graph technology is of great significance to modern agricultural information management and data-driven decision support. However, agricultural knowledge spans many types, and agricultural knowledge graph databases built from text alone do not support users' intuitive perception and comprehensive understanding of that knowledge. In view of this, this paper proposes a solution that uses a pre-trained language model to extract knowledge and construct an agricultural multimodal knowledge graph. Taking two crops, cabbage and corn, as research objects, the paper first adopts a text-image collaborative representation learning method with a two-stream structure to combine the image-modality information of vegetables with the text-modality information, exploiting the correlation and complementarity between the two modalities to achieve entity alignment. In addition, to address the high visual similarity of vegetable entities within fine-grained subcategories, a cross-modal fine-grained contrastive learning method is introduced, which remedies the insufficient semantic association between modalities by contrasting individual words with small regions of images. Finally, a visual multimodal knowledge graph user interface is built from the image-text matching results. Experimental results show that the fine-tuned pre-trained model achieves an image-text matching accuracy of 76.7% on the vegetable dataset and can match appropriate images to text entities. The constructed visual multimodal knowledge graph database allows users to query and filter knowledge according to their needs, supporting subsequent research on domain-specific applications such as multimodal agricultural question answering, crop pest and disease identification, and agricultural product recommendation.
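The two-stream image-text matching step described in the abstract can be illustrated with a generic dual-encoder contrastive objective. The sketch below is a minimal illustration, assuming pooled backbone features (e.g., CNN features for images and a pre-trained language model's sentence embedding for text), a shared 256-dimensional embedding space, and a CLIP-style symmetric InfoNCE loss; the dimensions, projection heads, and temperature are illustrative assumptions, not the VEG-MMKG architecture itself. The paper's fine-grained variant applies the same contrastive idea between word embeddings and small image regions rather than between whole sentences and whole images.

```python
# Minimal sketch of a two-stream (dual-encoder) image-text contrastive
# objective. All hyperparameters and layer choices here are illustrative
# assumptions, not the paper's exact VEG-MMKG model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualEncoder(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256):
        super().__init__()
        # Projection heads mapping each modality's backbone features
        # (e.g. a CNN for images, a pre-trained language model for text)
        # into a shared embedding space.
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)
        # Learnable log-temperature, initialized near log(1/0.07).
        self.logit_scale = nn.Parameter(torch.tensor(2.659))

    def forward(self, img_feats, txt_feats):
        # L2-normalize so the dot product below is cosine similarity.
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return img, txt


def contrastive_loss(img, txt, logit_scale):
    # Symmetric InfoNCE: matched image-text pairs sit on the diagonal of
    # the similarity matrix (positives); all other in-batch pairs are
    # negatives. The fine-grained variant would apply the same objective
    # between word embeddings and image-region embeddings.
    logits = logit_scale.exp() * img @ txt.t()       # (B, B) similarities
    targets = torch.arange(img.size(0), device=img.device)
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


# Toy usage with random features standing in for real backbone outputs.
model = DualEncoder()
img_feats = torch.randn(8, 2048)  # e.g. pooled CNN features for 8 images
txt_feats = torch.randn(8, 768)   # e.g. sentence embeddings for 8 entities
img, txt = model(img_feats, txt_feats)
print(contrastive_loss(img, txt, model.logit_scale).item())
```

At inference time, the same similarity matrix can be reused for retrieval: for each text entity, the image with the highest cosine similarity is selected, which is how an image-text matching accuracy such as the reported 76.7% would typically be evaluated.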
Pages: 13
Related Papers
50 records in total
  • [41] BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models
    He, Bin
    Zhou, Di
    Xiao, Jinghui
    Jiang, Xin
    Liu, Qun
    Yuan, Nicholas Jing
    Xu, Tong
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020: 2281-2290
  • [42] An empirical study of pre-trained language models in simple knowledge graph question answering
Hu, Nan
Wu, Yike
Qi, Guilin
Min, Dehai
Chen, Jiaoyan
Pan, Jeff Z.
Ali, Zafar
WORLD WIDE WEB, 2023, 26: 2855-2886
  • [43] KG-prompt: Interpretable knowledge graph prompt for pre-trained language models
    Chen, Liyi
    Liu, Jie
    Duan, Yutai
    Wang, Runze
    KNOWLEDGE-BASED SYSTEMS, 2025, 311
  • [44] Grounding Dialogue Systems via Knowledge Graph Aware Decoding with Pre-trained Transformers
    Chaudhuri, Debanjan
    Rony, Md Rashad Al Hasan
    Lehmann, Jens
SEMANTIC WEB, ESWC 2021, 2021, 12731: 323-339
  • [45] Using Noise and External Knowledge to Enhance Chinese Pre-trained Model
    Ma, Haoyang
    Li, Zeyu
    Guo, Hongyu
2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022: 476-480
  • [46] Acquiring Knowledge from Pre-Trained Model to Neural Machine Translation
    Weng, Rongxiang
    Yu, Heng
    Huang, Shujian
    Cheng, Shanbo
    Luo, Weihua
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 9266-9273
  • [47] KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
    Wang, Xiaozhi
    Gao, Tianyu
    Zhu, Zhaocheng
    Zhang, Zhengyan
    Liu, Zhiyuan
    Li, Juanzi
    Tang, Jian
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2021, 9: 176-194
  • [48] Explanation Guided Knowledge Distillation for Pre-trained Language Model Compression
    Yang, Zhao
    Zhang, Yuanzhe
    Sui, Dianbo
    Ju, Yiming
    Zhao, Jun
    Liu, Kang
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (02)
  • [49] Emergency entity relationship extraction for water diversion project based on pre-trained model and multi-featured graph convolutional network
    Wang, Li Hu
    Liu, Xue Mei
    Liu, Yang
    Li, Hai Rui
    Liu, Jia Qi
    Yang, Li Bo
PLOS ONE, 2023, 18 (10)
  • [50] Multimodal Topic and Sentiment Recognition for Chinese Data Based on Pre-trained Encoders
    Chen, Qian
    Chen, Siting
    Wu, Changli
    Peng, Jun
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431: 323-334