Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities

Cited by: 111
Authors
Gene-Mola, Jordi [1]
Vilaplana, Veronica [2]
Rosell-Polo, Joan R. [1]
Morros, Josep-Ramon [2]
Ruiz-Hidalgo, Javier [2]
Gregorio, Eduard [1]
Affiliations
[1] Univ Lleida UdL, Agrotecnio Ctr, Dept Agr & Forest Engn, Res Grp AgroICT & Precis Agr, Lleida, Catalonia, Spain
[2] Univ Politecn Cataluna, Dept Signal Theory & Commun, Barcelona, Catalonia, Spain
Keywords
RGB-D; Multi-modal faster R-CNN; Convolutional neural networks; Fruit detection; Agricultural robotics; Fruit reflectance; TERRESTRIAL LASER SCANNER; FRUIT DETECTION; PRECISION AGRICULTURE; STRUCTURED LIGHT; ORCHARD; IMAGES; COLOR; LIDAR; TREE; SENSORS;
DOI
10.1016/j.compag.2019.05.016
Chinese Library Classification (CLC) number
S [Agricultural Sciences];
Discipline classification code
09;
Abstract
Fruit detection and localization will be essential for future agronomic management of fruit crops, with applications in yield prediction, yield mapping and automated harvesting. RGB-D cameras are promising sensors for fruit detection given that they provide geometrical information with color data. Some of these sensors work on the principle of time-of-flight (ToF) and, besides color and depth, provide the backscatter signal intensity. However, this radiometric capability has not been exploited for fruit detection applications. This work presents the KFuji RGB-DS database, composed of 967 multi-modal images containing a total of 12,839 Fuji apples. Compilation of the database allowed a study of the usefulness of fusing RGB-D and radiometric information obtained with Kinect v2 for fruit detection. To do so, the signal intensity was range corrected to overcome signal attenuation, obtaining an image that was proportional to the reflectance of the scene. A registration between RGB, depth and intensity images was then carried out. The Faster R-CNN model was adapted for use with five channel input images: color (RGB), depth (D) and range-corrected intensity signal (S). Results show an improvement of 4.46% in F1-score when adding depth and range-corrected intensity channels, obtaining an F1-score of 0.898 and an AP of 94.8% when all channels are used. From our experimental results, it can be concluded that the radiometric capabilities of ToF sensors give valuable information for fruit detection.
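The range correction and five-channel stacking described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the correction formula, so the sketch assumes the backscattered intensity falls off with the square of the distance (the `exponent=2.0` default), and the function names `range_correct` and `stack_rgbds` are hypothetical. Plain Python lists are used to keep the sketch dependency-free, and image registration is assumed to have been done already.

```python
def range_correct(intensity, depth, exponent=2.0):
    """Multiply each intensity value by depth**exponent.

    Assumption: attenuation follows an inverse power law of the range;
    the exponent of 2 is illustrative and is not taken from the abstract.
    Pixels without a valid depth reading (depth <= 0) are set to 0.0.
    """
    return [
        [i * d ** exponent if d > 0 else 0.0 for i, d in zip(i_row, d_row)]
        for i_row, d_row in zip(intensity, depth)
    ]


def stack_rgbds(rgb, depth, intensity):
    """Build a five-channel image: each pixel becomes [R, G, B, D, S]."""
    s = range_correct(intensity, depth)
    return [
        [list(px) + [d, sv] for px, d, sv in zip(rgb_row, d_row, s_row)]
        for rgb_row, d_row, s_row in zip(rgb, depth, s)
    ]


# Two pixels of a similar surface seen at 1 m and 2 m: the raw intensity
# drops by a factor of four, while the corrected values come out equal.
depth = [[1.0, 2.0]]
intensity = [[0.8, 0.2]]
rgb = [[(120, 30, 25), (118, 32, 27)]]
image = stack_rgbds(rgb, depth, intensity)
```

Under this assumed model, the corrected S channel no longer depends on how far the surface is from the sensor, which is what lets it act as a proxy for scene reflectance alongside the color and depth channels.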
Pages: 689-698
Page count: 10
Related papers
50 in total
  • [41] Progressive Guided Fusion Network With Multi-Modal and Multi-Scale Attention for RGB-D Salient Object Detection
    Wu, Jiajia
    Han, Guangliang
    Wang, Haining
    Yang, Hang
    Li, Qingqing
    Liu, Dongxu
    Ye, Fangjian
    Liu, Peixun
    IEEE ACCESS, 2021, 9 : 150608 - 150622
  • [42] People detection and tracking using RGB-D cameras for mobile robots
    Liu, Hengli
    Luo, Jun
    Wu, Peng
    Xie, Shaorong
    Li, Hengyu
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2016, 13 : 1 - 11
  • [43] Multi-modal interactive attention and dual progressive decoding network for RGB-D/T salient object detection
    Liang, Yanhua
    Qin, Guihe
    Sun, Minghui
    Qin, Jun
    Yan, Jie
    Zhang, Zhonghan
    NEUROCOMPUTING, 2022, 490 : 132 - 145
  • [44] Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection
    Chen, Hao
    Li, Youfu
    Su, Dan
    PATTERN RECOGNITION, 2019, 86 : 376 - 385
  • [45] MAPNet: Multi-modal attentive pooling network for RGB-D indoor scene classification
    Li, Yabei
    Zhang, Zhang
    Cheng, Yanhua
    Wang, Liang
    Tan, Tieniu
    PATTERN RECOGNITION, 2019, 90 : 436 - 449
  • [46] A Systematic Deep Learning Based Overhead Tracking and Counting System Using RGB-D Remote Cameras
    Gochoo, Munkhjargal
    Rizwan, Syeda Amna
    Ghadi, Yazeed Yasin
    Jalal, Ahmad
    Kim, Kibum
    APPLIED SCIENCES-BASEL, 2021, 11 (12):
  • [47] Lightweight Multi-modal Representation Learning for RGB Salient Object Detection
    Xiao, Yun
    Huang, Yameng
    Li, Chenglong
    Liu, Lei
    Zhou, Aiwu
    Tang, Jin
    COGNITIVE COMPUTATION, 2023, 15 (06) : 1868 - 1883
  • [49] Multi-Modal Song Mood Detection with Deep Learning
    Pyrovolakis, Konstantinos
    Tzouveli, Paraskevi
    Stamou, Giorgos
    SENSORS, 2022, 22 (03)
  • [50] M2RNet: Multi-modal and multi-scale refined network for RGB-D salient object detection
    Fang, Xian
    Jiang, Mingfeng
    Zhu, Jinchao
    Shao, Xiuli
    Wang, Hongpeng
    PATTERN RECOGNITION, 2023, 135