Effective Multi-Species weed detection in complex wheat fields using Multi-Modal and Multi-View image fusion

Cited by: 1

Authors:

Xu, Ke [1,2,3,4,5]

Xie, Qi [1,3,4,5]

Zhu, Yan [1,3,4,5]

Cao, Weixing [1]

Ni, Jun [1,3,4,5]

Affiliations:

[1] Nanjing Agr Univ, Coll Agr, Nanjing 210095, Peoples R China

[2] Anhui Polytech Univ, Coll Integrated Circuits, Wuhu 241000, Peoples R China

[3] Natl Engn & Technol Ctr Informat Agr, Nanjing 210095, Peoples R China

[4] Minist Educ, Engn Res Ctr Smart Agr, Nanjing 210095, Peoples R China

[5] Collaborat Innovat Ctr Modern Crop Prod Cosponsore, Nanjing 210095, Peoples R China

Source:

COMPUTERS AND ELECTRONICS IN AGRICULTURE | 2025 / Vol. 230

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-modal image; Multi-view image; Weed detection; Deep learning; Complex wheat fields; CLASSIFICATION; SEGMENTATION; CAMERA; IDENTIFICATION; VEGETATION; INDEXES;

DOI:

10.1016/j.compag.2025.109924

Chinese Library Classification (CLC) number:

S [Agricultural Sciences];

Discipline classification code:

09;

Abstract:

Rapid and accurate acquisition of weed information in wheat fields is a precondition for precision weeding. Weed detection in open, complex wheat fields faces two challenges: 1) detecting grass weeds that closely resemble wheat; and 2) detecting weeds under leaf occlusion. First, dual-modal information, namely red-green-blue (RGB) images and depth images, was introduced to overcome the limitations of single-modal image features in identifying grass weeds. Then, a dual-path Swin Transformer model was developed for multi-modal feature extraction, and an alignment and attention module (AAM) was designed to align and fuse information from the different modalities. Finally, multi-view images were introduced to overcome leaf occlusion in natural wheat fields by completing the feature space of objects, and a multi-view information fusion method based on the common underlying principle of view agreement was proposed. The experimental results demonstrate that depth information is a valuable complement to RGB imagery, facilitating the detection of grass weeds and significantly enhancing weed detection accuracy in wheat fields. Compared with a deep convolutional neural network (DCNN) model, the dual-path Swin Transformer model and the model containing the AAM improved weed detection accuracy by 6.6% and 11.03%, respectively. Incorporating multi-view information effectively addressed leaf occlusion, resulting in an 85.14% increase in weed detection accuracy. Furthermore, the problem of view divergence in multi-view learning was resolved, reducing the false detection rate by 23.94% compared with the direct combination method.
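
The abstract contrasts a direct combination of views with fusion based on view agreement. The difference can be illustrated with a minimal sketch. This is not the authors' method, whose details are not given in this record; it only assumes that each view yields a set of detected class labels, and that "agreement" means a label is reported by at least a chosen number of views. The function names, the `min_views` parameter, and the labels `weed_a`, `weed_b`, `weed_c` are all hypothetical.

```python
from collections import Counter


def fuse_by_union(view_labels):
    """Direct combination (assumed here to mean a plain union):
    keep every label reported by any view."""
    fused = set()
    for labels in view_labels:
        fused |= set(labels)
    return fused


def fuse_by_agreement(view_labels, min_views=2):
    """Agreement-style fusion (illustrative assumption): keep only
    labels that at least `min_views` views report."""
    # Count each label at most once per view.
    counts = Counter(label for labels in view_labels for label in set(labels))
    return {label for label, n in counts.items() if n >= min_views}


# Hypothetical detections from three camera views of the same plot.
views = [{"weed_a", "weed_b"}, {"weed_a"}, {"weed_a", "weed_c"}]

print(sorted(fuse_by_union(views)))      # ['weed_a', 'weed_b', 'weed_c']
print(sorted(fuse_by_agreement(views)))  # ['weed_a']
```

In this toy case the union keeps two labels that only a single view reports, while the agreement rule drops them. That is one plausible way for view divergence to inflate false detections under direct combination.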


Pages: 17
