Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments

Cited by: 5
Authors
Xu, Chi [1, 2, 3]
Zhou, Jun [1, 2]
Cai, Wendi [1, 2]
Jiang, Yunkai [1, 2]
Li, Yongbo [1, 2]
Liu, Yi [4, 5]
Affiliations
[1] China Univ Geosci, Sch Automat, Wuhan 430074, Peoples R China
[2] Hubei Key Lab Adv Control & Intelligent Automat C, Wuhan 430074, Peoples R China
[3] Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, Wuhan 430074, Peoples R China
[4] CRRC Zhuzhou Elect Locomot Co Ltd, Zhuzhou 412000, Peoples R China
[5] Natl Innovat Ctr Adv Rail Transit Equipment, Zhuzhou 412000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D hand detection; RGB-D sensor; human-computer interaction; unseen lighting condition; adaptive RGB-D fusion; OBJECT DETECTION; RECOGNITION; NETWORK
DOI
10.3390/s20216360
Chinese Library Classification number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Three-dimensional hand detection from a single RGB-D image is an important technology that supports many useful applications. In practice, it is challenging to detect human hands robustly in unconstrained environments because the RGB-D channels can be affected by many uncontrollable factors, such as lighting changes. To tackle this problem, we propose a 3D hand detection approach that improves robustness and accuracy by adaptively fusing the complementary features extracted from the RGB and D channels. Using the fused RGB-D feature, the 2D bounding boxes of hands are detected first, and then the 3D locations along the z-axis are estimated through a cascaded network. Furthermore, we present a challenging RGB-D hand detection dataset collected in unconstrained environments. Unlike previous works, which rely primarily on either the RGB or the D channel, we adaptively fuse the RGB-D channels for hand detection. Evaluation results show that the D channel is crucial for hand detection in unconstrained environments. Our RGB-D fusion-based approach improves hand detection accuracy from 69.1 to 74.1 compared to a state-of-the-art RGB-based hand detector. The existing RGB- or D-based methods are unstable in unseen lighting conditions: in dark conditions, the accuracy of the RGB-based method drops to 48.9, and in back-light conditions, the accuracy of the D-based method drops to 28.3. Compared with these methods, our RGB-D fusion-based approach is much more robust, reaching accuracies of 62.5 and 65.9, respectively, in these two extreme lighting conditions.
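The idea of adaptively fusing RGB and depth features can be illustrated with a minimal sketch. It is not the architecture from the paper, which the abstract does not specify: the function `adaptive_fuse`, the gate matrix `w_gate`, and the use of global average pooling followed by a softmax are all assumptions chosen for illustration. The sketch only shows how input-dependent weights can shift the fused feature toward whichever modality responds more strongly, for example toward depth when the RGB image is dark.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def adaptive_fuse(rgb_feat, depth_feat, w_gate):
    """Fuse two (C, H, W) feature maps with input-dependent weights.

    w_gate is a hypothetical gate matrix of shape (2, 2C); in a trained
    network it would be learned, here it is supplied by the caller.
    """
    # Global average pooling summarises each modality as a C-vector.
    descriptor = np.concatenate(
        [rgb_feat.mean(axis=(1, 2)), depth_feat.mean(axis=(1, 2))]
    )
    # Two logits become two non-negative weights that sum to 1.
    weights = softmax(w_gate @ descriptor)
    fused = weights[0] * rgb_feat + weights[1] * depth_feat
    return fused, weights


# Illustrative input: a "dark" RGB feature map (all zeros) and an
# informative depth feature map (all ones).
C, H, W = 4, 8, 8
rgb_dark = np.zeros((C, H, W))
depth_ok = np.ones((C, H, W))

# Hand-picked gate for the demo: each logit sums one modality's descriptor.
w_gate = np.zeros((2, 2 * C))
w_gate[0, :C] = 1.0
w_gate[1, C:] = 1.0

fused, weights = adaptive_fuse(rgb_dark, depth_ok, w_gate)
```

With this hand-picked gate, the depth weight exceeds the RGB weight because the depth descriptor is larger; a learned gate could encode a different notion of reliability.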
Pages: 1 / 22
Number of pages: 22