Effective multi-species weed detection in complex wheat fields using multi-modal and multi-view image fusion

Cited by: 1
Authors
Xu, Ke [1 ,2 ,3 ,4 ,5 ]
Xie, Qi [1 ,3 ,4 ,5 ]
Zhu, Yan [1 ,3 ,4 ,5 ]
Cao, Weixing [1 ]
Ni, Jun [1 ,3 ,4 ,5 ]
Affiliations
[1] Nanjing Agr Univ, Coll Agr, Nanjing 210095, Peoples R China
[2] Anhui Polytech Univ, Coll Integrated Circuits, Wuhu 241000, Peoples R China
[3] Natl Engn & Technol Ctr Informat Agr, Nanjing 210095, Peoples R China
[4] Minist Educ, Engn Res Ctr Smart Agr, Nanjing 210095, Peoples R China
[5] Collaborat Innovat Ctr Modern Crop Prod Cosponsore, Nanjing 210095, Peoples R China
Funding
National Natural Science Foundation of China
关键词
Multi-modal image; Multi-view image; Weed detection; Deep learning; Complex wheat fields; CLASSIFICATION; SEGMENTATION; CAMERA; IDENTIFICATION; VEGETATION; INDEXES;
DOI
10.1016/j.compag.2025.109924
Chinese Library Classification number
S [Agricultural Sciences]
Discipline code
09
Abstract
Rapid and accurate acquisition of weed information in wheat fields is the precondition and key to precision weeding. Weed detection in open, complex wheat fields faces two challenges: 1) detecting grass weeds whose appearance closely resembles wheat; and 2) detecting weeds under leaf occlusion. Dual-modal information, namely red-green-blue (RGB) images and depth images, was introduced to overcome the limitations of single-modal image features in identifying grass weeds. A dual-path Swin Transformer model was then developed for multi-modal feature extraction, and an alignment and attention module (AAM) was designed to align and fuse information across the two modalities. Finally, multi-view images were introduced to overcome leaf occlusion in natural wheat fields by completing the feature space of occluded objects, and a multi-view information fusion method based on the principle of view agreement was proposed. The experimental results demonstrate that depth information is a valuable complement to RGB imagery, facilitating the detection of grass weeds and significantly enhancing weed detection accuracy in wheat fields. Compared with a deep convolutional neural network (DCNN) model, the dual-path Swin Transformer model and the model containing the AAM improved weed detection accuracy by 6.6% and 11.03%, respectively. Incorporating multi-view information effectively addressed leaf occlusion, yielding an 85.14% increase in weed detection accuracy; resolving the problem of view divergence in multi-view learning further reduced the false detection rate by 23.94% compared with a direct combination method.
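The record gives only the abstract, not the authors' AAM or view-agreement formulations. As a rough, hypothetical illustration of the two ideas described above (attention-weighted fusion of RGB and depth features, and keeping only detections that agree across views), a minimal numpy sketch might look like this; the gating by mean activation and the minimum-view threshold are stand-ins, not the paper's method:

```python
import numpy as np
from collections import Counter

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(rgb_feat, depth_feat):
    """Fuse RGB and depth feature maps with a per-location modality weight.

    rgb_feat, depth_feat: arrays of shape (N, C) -- N locations, C channels.
    Returns a fused (N, C) feature map; each location mixes the two
    modalities according to a softmax over a simple gating score.
    """
    # Mean activation serves here as a stand-in for a learned gating network.
    scores = np.stack([rgb_feat.mean(axis=1), depth_feat.mean(axis=1)], axis=1)  # (N, 2)
    weights = softmax(scores, axis=1)                                            # (N, 2)
    return weights[:, :1] * rgb_feat + weights[:, 1:] * depth_feat

def view_agreement(detections_per_view, min_views=2):
    """Keep object labels detected in at least `min_views` of the views."""
    counts = Counter(label for view in detections_per_view for label in set(view))
    return {label for label, c in counts.items() if c >= min_views}
```

For example, an object detected in two of three camera views survives `view_agreement`, while a single-view (possibly spurious) detection is discarded, which mirrors the abstract's point that agreement across views suppresses false detections.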
Pages: 17