A Framework for Agricultural Intelligent Analysis Based on a Visual Language Large Model

Cited by: 1
Authors
Yu, Piaofang [1 ,2 ]
Lin, Bo [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Sch Software Technol, Ningbo 315048, Peoples R China
[2] Zhejiang Univ, Binjiang Inst, Innovat Ctr Informat, Hangzhou 310053, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 18
Keywords
visual language large model; cross-modal fusion; image recognition; agricultural knowledge understanding
DOI
10.3390/app14188350
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Smart agriculture has become an inevitable trend in modern agriculture, driven in particular by the continuing progress of large language models such as Chat Generative Pre-trained Transformer (ChatGPT) and General Language Model (ChatGLM). Although these models perform well on general knowledge, they remain limited and error-prone on specialized agricultural knowledge such as crop disease identification and growth stage judgment. Agricultural data spans images, text, and other modalities, all of which play an important role in agricultural production and management. To better learn the characteristics of the different modalities, achieve cross-modal data fusion, and thereby understand complex application scenarios, we propose AgriVLM, a framework that fine-tunes a visual language model on a large amount of agricultural data. It fuses multimodal data and provides more comprehensive agricultural decision support. Specifically, it uses a Q-Former as a bridge between an image encoder and a language model to achieve cross-modal fusion of agricultural images and text. We then apply low-rank adaptation (LoRA) to fine-tune the language model, aligning agricultural image features with the pre-trained language model. Experimental results show that AgriVLM performs well in crop disease recognition and growth stage recognition, with recognition accuracy exceeding 90%, demonstrating its ability to analyze different modalities of agricultural data.
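The low-rank fine-tuning step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the matrix sizes, the rank, the scaling factor `alpha`, and the helper names `matmul` and `lora_forward` are all assumptions chosen for illustration. The idea shown is only the general one behind low-rank adaptation: the pre-trained weight matrix stays frozen, and a small trainable product of two thin matrices is added to its output.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, w_frozen, lora_a, lora_b, alpha, rank):
    """Compute x @ W + (alpha / rank) * (x @ A @ B) without modifying W."""
    base = matmul(x, w_frozen)                  # frozen pre-trained path
    update = matmul(matmul(x, lora_a), lora_b)  # trainable low-rank path
    scale = alpha / rank
    return [[b + scale * u for b, u in zip(brow, urow)]
            for brow, urow in zip(base, update)]

# Hypothetical sizes, chosen only to keep the example small.
random.seed(0)
d_in, d_out, rank, alpha = 8, 8, 2, 4.0

w_frozen = [[random.gauss(0, 0.1) for _ in range(d_out)] for _ in range(d_in)]
lora_a = [[random.gauss(0, 0.01) for _ in range(rank)] for _ in range(d_in)]
lora_b = [[0.0] * d_out for _ in range(rank)]  # zero init: the adapter starts as a no-op

x = [[1.0] * d_in]
y = lora_forward(x, w_frozen, lora_a, lora_b, alpha, rank)

full_params = d_in * d_out           # weights a full fine-tune would update
lora_params = rank * (d_in + d_out)  # weights the adapter trains instead
print(full_params, lora_params)
```

With these toy sizes the adapter trains half as many weights as a full update; the saving grows quickly as the layer dimensions become large relative to the rank.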
Pages: 15
Related papers
50 in total
  • [21] From Small Data Modeling to Large Language Model Screening: A Dual-Strategy Framework for Materials Intelligent Design
    Yu, Yeyong
    Xiong, Jie
    Wu, Xing
    Qian, Quan
    ADVANCED SCIENCE, 2024, 11 (45)
  • [22] Ethical Education Data Mining Framework for Analyzing and Evaluating Large Language Model-Based Conversational Intelligent Tutoring Systems for Management and Entrepreneurship Courses
    Ilagan, Joseph Benjamin R.
    Ilagan, Jose Ramon S.
    Rodrigo, Maria Mercedes T.
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 1, ICICT 2024, 2024, 1011 : 61 - 71
  • [23] Design of Agricultural Products Intelligent Transportation Logistics Freight Forecasting System Based on Large Data Analysis
    Ai, Xiao-yan
    Zhang, Yong-heng
    ADVANCED HYBRID INFORMATION PROCESSING, ADHIP 2019, PT II, 2019, 302 : 392 - 400
  • [24] Visual language framework for MSA
    Kosar, T
    Rebernak, D
    Mernik, M
    Zumer, V
    ITI 2004: PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES, 2004, : 395 - 400
  • [25] Visual large language model for wheat disease diagnosis in the wild
    Zhang, Kunpeng
    Ma, Li
    Cui, Beibei
    Li, Xin
    Zhang, Boqiang
    Xie, Na
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 227
  • [26] Large Language Model Firewall for AIGC Protection with Intelligent Detection Policy
    Huang, Tianrui
    You, Lina
    Cai, Nishui
    Huang, Ting
    2024 2ND INTERNATIONAL CONFERENCE ON MOBILE INTERNET, CLOUD COMPUTING AND INFORMATION SECURITY, MICCIS 2024, 2024, : 247 - 252
  • [27] Ontology-integrated tuning of large language model for intelligent maintenance
    Wang, Peng
    Karigiannis, John
    Gao, Robert X.
    CIRP ANNALS-MANUFACTURING TECHNOLOGY, 2024, 73 (01) : 361 - 364
  • [28] Integrating visual large language model and reasoning chain for driver behavior analysis and risk assessment
    Zhang, Kunpeng
    Wang, Shipu
    Jia, Ning
    Zhao, Liang
    Han, Chunyang
    Li, Li
    ACCIDENT ANALYSIS AND PREVENTION, 2024, 198
  • [29] LLM-CloudSec: Large Language Model Empowered Automatic and Deep Vulnerability Analysis for Intelligent Clouds
    Cao, Daipeng
    Wu, Jun
    IEEE INFOCOM 2024-IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS, INFOCOM WKSHPS 2024, 2024,
  • [30] Investigating the utilization and impact of large language model-based intelligent teaching assistants in flipped classrooms
    Teng, Da
    Wang, Xiangyang
    Xia, Yanwei
    Zhang, Yue
    Tang, Lulu
    Chen, Qi
    Zhang, Ruobing
    Xie, Sujin
    Yu, Weiyong
    EDUCATION AND INFORMATION TECHNOLOGIES, 2024,