Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities

Cited by: 111

Authors:

Gene-Mola, Jordi [1]

Vilaplana, Veronica [2]

Rosell-Polo, Joan R. [1]

Morros, Josep-Ramon [2]

Ruiz-Hidalgo, Javier [2]

Gregorio, Eduard [1]

Affiliations:

[1] Univ Lleida UdL, Agrotecnio Ctr, Dept Agr & Forest Engn, Res Grp AgroICT & Precis Agr, Lleida, Catalonia, Spain

[2] Univ Politecn Cataluna, Dept Signal Theory & Commun, Barcelona, Catalonia, Spain

Source:

COMPUTERS AND ELECTRONICS IN AGRICULTURE | 2019 / Vol. 162

Keywords:

RGB-D; Multi-modal faster R-CNN; Convolutional neural networks; Fruit detection; Agricultural robotics; Fruit reflectance; TERRESTRIAL LASER SCANNER; FRUIT DETECTION; PRECISION AGRICULTURE; STRUCTURED LIGHT; ORCHARD; IMAGES; COLOR; LIDAR; TREE; SENSORS;

DOI:

10.1016/j.compag.2019.05.016

Chinese Library Classification (CLC) number:

S [Agricultural Sciences];

Discipline classification code:

09;

Abstract:

Fruit detection and localization will be essential for future agronomic management of fruit crops, with applications in yield prediction, yield mapping and automated harvesting. RGB-D cameras are promising sensors for fruit detection given that they provide geometrical information with color data. Some of these sensors work on the principle of time-of-flight (ToF) and, besides color and depth, provide the backscatter signal intensity. However, this radiometric capability has not been exploited for fruit detection applications. This work presents the KFuji RGB-DS database, composed of 967 multi-modal images containing a total of 12,839 Fuji apples. Compilation of the database allowed a study of the usefulness of fusing RGB-D and radiometric information obtained with Kinect v2 for fruit detection. To do so, the signal intensity was range corrected to overcome signal attenuation, obtaining an image that was proportional to the reflectance of the scene. A registration between RGB, depth and intensity images was then carried out. The Faster R-CNN model was adapted for use with five channel input images: color (RGB), depth (D) and range-corrected intensity signal (S). Results show an improvement of 4.46% in F1-score when adding depth and range-corrected intensity channels, obtaining an F1-score of 0.898 and an AP of 94.8% when all channels are used. From our experimental results, it can be concluded that the radiometric capabilities of ToF sensors give valuable information for fruit detection.
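The preprocessing described in the abstract (range-correcting the backscattered intensity, then combining it with color and depth into a five-channel image) can be illustrated with a minimal sketch. The abstract does not state the correction formula, so the inverse-square model used here (intensity multiplied by distance squared) is an assumption, and the function names `range_correct` and `stack_rgbds` are hypothetical. The images are assumed to be already registered to one another.

```python
def range_correct(intensity, depth_m, exponent=2.0):
    """Scale each intensity value by depth**exponent.

    ASSUMPTION: signal attenuation follows an inverse-square law, so
    multiplying by distance squared gives a value roughly proportional
    to reflectance. Pixels without a valid depth (<= 0) are set to 0.0.
    """
    return [
        [i * (d ** exponent) if d > 0 else 0.0 for i, d in zip(i_row, d_row)]
        for i_row, d_row in zip(intensity, depth_m)
    ]


def stack_rgbds(rgb, depth_m, corrected):
    """Merge registered images into per-pixel (R, G, B, D, S) tuples."""
    return [
        [(r, g, b, d, s) for (r, g, b), d, s in zip(rgb_row, d_row, s_row)]
        for rgb_row, d_row, s_row in zip(rgb, depth_m, corrected)
    ]


# Tiny 1x2 image: the second pixel has no valid depth reading.
rgb = [[(200, 30, 40), (10, 120, 20)]]
depth_m = [[2.0, 0.0]]
intensity = [[50.0, 80.0]]

s = range_correct(intensity, depth_m)
pixels = stack_rgbds(rgb, depth_m, s)
print(pixels[0][0])  # five values per pixel: R, G, B, D, S
```

A detector that accepts five-channel input would then consume `pixels` in place of a plain RGB image; how the first network layer is adapted for the extra channels is not specified in the abstract and is left out of this sketch.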


Pages: 689 - 698

Page count: 10

Related records: 50 in total

[31] RGB-D Salient Object Detection Method Based on Multi-Modal Fusion and Contour Guidance
Peng, Yanbin
Feng, Mingkun
Zheng, Zhijun
IEEE ACCESS, 2023, 11 : 145217 - 145230
[32] RGB-D Image Saliency Detection Based on Multi-modal Feature-fused Supervision
Liu Zhengyi
Duan Quntao
Shi Song
Zhao Peng
JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2020, 42 (04) : 997 - 1004
[33] An improved YOLOv7 network using RGB-D multi-modal feature fusion for tea shoots detection
Wu, Yanxu
Chen, Jianneng
Wu, Shunkai
Li, Hui
He, Leiying
Zhao, Runmao
Wu, Chuanyu
COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 216
[34] A Multi-Modal, Discriminative and Spatially Invariant CNN for RGB-D Object Labeling
Asif, Umar
Bennamoun, Mohammed
Sohel, Ferdous A.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (09) : 2051 - 2065
[35] Learning a deeply supervised multi-modal RGB-D embedding for semantic scene and object category recognition
Zaki, Hasan F. M.
Shafait, Faisal
Mian, Ajmal
ROBOTICS AND AUTONOMOUS SYSTEMS, 2017, 92 : 41 - 52
[36] MMPL-Net: multi-modal prototype learning for one-shot RGB-D segmentation
Shan, Dexing
Zhang, Yunzhou
Liu, Xiaozheng
Liu, Shitong
Coleman, Sonya A.
Kerr, Dermot
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (14) : 10297 - 10310
[37] MMPL-Net: multi-modal prototype learning for one-shot RGB-D segmentation
Dexing Shan
Yunzhou Zhang
Xiaozheng Liu
Shitong Liu
Sonya A. Coleman
Dermot Kerr
Neural Computing and Applications, 2023, 35 : 10297 - 10310
[38] Hierarchical multi-modal fusion FCN with attention model for RGB-D tracking
Jiang, Ming-xin
Deng, Chao
Shan, Jing-song
Wang, Yuan-yuan
Jia, Yin-jie
Sun, Xing
INFORMATION FUSION, 2019, 50 : 1 - 8
[39] Eulerian Magnification of Multi-Modal RGB-D Video for Heart Rate Estimation
Dosso, Yasmina Souley
Bekele, Amente
Green, James R.
2018 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS (MEMEA), 2018, : 642 - 647
[40] Real-Time Multi-Modal People Detection and Tracking of Mobile Robots with A RGB-D Sensor
Huang, Wenchao
Zhou, Bo
Qian, Kun
Fang, Fang
Ma, Xudong
2019 IEEE 4TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2019), 2019, : 325 - 330
