Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities

Cited by: 111
Authors
Gene-Mola, Jordi [1]
Vilaplana, Veronica [2]
Rosell-Polo, Joan R. [1]
Morros, Josep-Ramon [2]
Ruiz-Hidalgo, Javier [2]
Gregorio, Eduard [1]
Affiliations
[1] Univ Lleida UdL, Agrotecnio Ctr, Dept Agr & Forest Engn, Res Grp AgroICT & Precis Agr, Lleida, Catalonia, Spain
[2] Univ Politecn Cataluna, Dept Signal Theory & Commun, Barcelona, Catalonia, Spain
Keywords
RGB-D; Multi-modal faster R-CNN; Convolutional neural networks; Fruit detection; Agricultural robotics; Fruit reflectance; TERRESTRIAL LASER SCANNER; FRUIT DETECTION; PRECISION AGRICULTURE; STRUCTURED LIGHT; ORCHARD; IMAGES; COLOR; LIDAR; TREE; SENSORS
DOI
10.1016/j.compag.2019.05.016
Chinese Library Classification (CLC) number
S [Agricultural sciences]
Discipline classification code
09
Abstract
Fruit detection and localization will be essential for future agronomic management of fruit crops, with applications in yield prediction, yield mapping and automated harvesting. RGB-D cameras are promising sensors for fruit detection given that they provide geometrical information with color data. Some of these sensors work on the principle of time-of-flight (ToF) and, besides color and depth, provide the backscatter signal intensity. However, this radiometric capability has not been exploited for fruit detection applications. This work presents the KFuji RGB-DS database, composed of 967 multi-modal images containing a total of 12,839 Fuji apples. Compilation of the database allowed a study of the usefulness of fusing RGB-D and radiometric information obtained with Kinect v2 for fruit detection. To do so, the signal intensity was range corrected to overcome signal attenuation, obtaining an image that was proportional to the reflectance of the scene. A registration between RGB, depth and intensity images was then carried out. The Faster R-CNN model was adapted for use with five channel input images: color (RGB), depth (D) and range-corrected intensity signal (S). Results show an improvement of 4.46% in F1-score when adding depth and range-corrected intensity channels, obtaining an F1-score of 0.898 and an AP of 94.8% when all channels are used. From our experimental results, it can be concluded that the radiometric capabilities of ToF sensors give valuable information for fruit detection.
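The range correction and five-channel input described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the backscattered intensity falls off with the square of the range, so multiplying by the squared depth gives a value roughly proportional to reflectance. The abstract does not state the exact correction formula, and the function names `range_correct` and `stack_rgbds` are hypothetical.

```python
import numpy as np


def range_correct(intensity, depth_m):
    """Compensate intensity for range, ASSUMING inverse-square attenuation.

    intensity: (H, W) backscatter signal from the ToF sensor.
    depth_m:   (H, W) range per pixel, registered to the intensity image.
    """
    intensity = np.asarray(intensity, dtype=np.float32)
    depth_m = np.asarray(depth_m, dtype=np.float32)
    corrected = intensity * depth_m ** 2
    # Pixels without a usable depth reading (0 or NaN) are set to 0.
    valid = np.isfinite(depth_m) & (depth_m > 0)
    return np.where(valid, corrected, 0.0).astype(np.float32)


def stack_rgbds(rgb, depth_m, s):
    """Stack registered RGB (H, W, 3), D (H, W) and S (H, W) into (H, W, 5)."""
    rgb = np.asarray(rgb, dtype=np.float32)
    d = np.asarray(depth_m, dtype=np.float32)[..., None]
    s = np.asarray(s, dtype=np.float32)[..., None]
    return np.concatenate([rgb, d, s], axis=-1)


# Toy 1x3 image: the first two pixels have the same reflectance at
# different ranges; the third pixel has no depth reading.
intensity = np.array([[4.0, 1.0, 2.0]])
depth = np.array([[1.0, 2.0, 0.0]])
rgb = np.zeros((1, 3, 3))

s = range_correct(intensity, depth)
rgbds = stack_rgbds(rgb, depth, s)
```

Under the assumed attenuation model, the first two pixels end up with the same corrected value even though their raw intensities differ by a factor of four. The stacked array is the kind of five-channel input a detector with a widened first convolution layer could consume.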
Pages: 689-698
Page count: 10