Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities

Cited by: 111

Authors:

Gene-Mola, Jordi [1]

Vilaplana, Veronica [2]

Rosell-Polo, Joan R. [1]

Morros, Josep-Ramon [2]

Ruiz-Hidalgo, Javier [2]

Gregorio, Eduard [1]

Affiliations:

[1] Univ Lleida UdL, Agrotecnio Ctr, Dept Agr & Forest Engn, Res Grp AgroICT & Precis Agr, Lleida, Catalonia, Spain

[2] Univ Politecn Cataluna, Dept Signal Theory & Commun, Barcelona, Catalonia, Spain

Source:

COMPUTERS AND ELECTRONICS IN AGRICULTURE | 2019 / Vol. 162

Keywords:

RGB-D; Multi-modal faster R-CNN; Convolutional neural networks; Fruit detection; Agricultural robotics; Fruit reflectance; TERRESTRIAL LASER SCANNER; FRUIT DETECTION; PRECISION AGRICULTURE; STRUCTURED LIGHT; ORCHARD; IMAGES; COLOR; LIDAR; TREE; SENSORS;

DOI:

10.1016/j.compag.2019.05.016

Chinese Library Classification (CLC) number:

S [Agricultural Sciences];

Discipline classification code:

09;

Abstract:

Fruit detection and localization will be essential for future agronomic management of fruit crops, with applications in yield prediction, yield mapping and automated harvesting. RGB-D cameras are promising sensors for fruit detection given that they provide geometrical information with color data. Some of these sensors work on the principle of time-of-flight (ToF) and, besides color and depth, provide the backscatter signal intensity. However, this radiometric capability has not been exploited for fruit detection applications. This work presents the KFuji RGB-DS database, composed of 967 multi-modal images containing a total of 12,839 Fuji apples. Compilation of the database allowed a study of the usefulness of fusing RGB-D and radiometric information obtained with Kinect v2 for fruit detection. To do so, the signal intensity was range corrected to overcome signal attenuation, obtaining an image that was proportional to the reflectance of the scene. A registration between RGB, depth and intensity images was then carried out. The Faster R-CNN model was adapted for use with five-channel input images: color (RGB), depth (D) and range-corrected intensity signal (S). Results show an improvement of 4.46% in F1-score when adding depth and range-corrected intensity channels, obtaining an F1-score of 0.898 and an AP of 94.8% when all channels are used. From our experimental results, it can be concluded that the radiometric capabilities of ToF sensors give valuable information for fruit detection.
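
The preprocessing outlined in the abstract (range correction of the intensity signal, then fusion of the registered modalities into a five-channel input) could look roughly like the sketch below. It is illustrative only: the abstract does not give the correction formula, so the inverse-square form S = I * d^2 used here is an assumption, and the function names `range_correct` and `stack_rgbds` are hypothetical. The three images are assumed to be already registered to the same pixel grid.

```python
import numpy as np

def range_correct(intensity, depth_m, eps=1e-6):
    """Range-correct a ToF intensity image.

    ASSUMPTION: signal attenuation follows an inverse-square law, so the
    corrected signal is S = I * d^2. The abstract does not state the formula.
    """
    intensity = np.asarray(intensity, dtype=np.float32)
    depth_m = np.asarray(depth_m, dtype=np.float32)
    corrected = intensity * depth_m ** 2
    # Pixels without a valid depth reading (depth <= eps) are set to zero.
    corrected[depth_m <= eps] = 0.0
    return corrected

def stack_rgbds(rgb, depth_m, intensity):
    """Stack registered RGB (H, W, 3), depth (H, W) and intensity (H, W)
    into a single five-channel array of shape (H, W, 5)."""
    s = range_correct(intensity, depth_m)
    return np.dstack([
        np.asarray(rgb, dtype=np.float32),
        np.asarray(depth_m, dtype=np.float32),
        s,
    ])

# Tiny synthetic example: a 2x3 image with one invalid depth pixel.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
depth = np.full((2, 3), 2.0)
depth[0, 0] = 0.0
intensity = np.full((2, 3), 5.0)
x = stack_rgbds(rgb, depth, intensity)
print(x.shape)
```

An array of this shape would then be fed to a detector whose first convolutional layer accepts five input channels instead of three; how the authors adapted Faster R-CNN for that is not described in this record.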


Pages: 689-698

Number of pages: 10

