HTC-Grasp: A Hybrid Transformer-CNN Architecture for Robotic Grasp Detection

Cited by: 5
Authors
Zhang, Qiang [1]
Zhu, Jianwei [1]
Sun, Xueying [1]
Liu, Mingmin [2]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Automat, 666 Changhui Rd, Zhenjiang 212100, Peoples R China
[2] SIASUN Robot & Automat Co Ltd, Cent Res Inst, 16 Jinhui St, Shenyang 110168, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
robotic grasp; transformer; attentional mechanism
DOI
10.3390/electronics12061505
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurately detecting suitable grasp areas for unknown objects from visual information remains a challenging task. Inspired by the success of the Vision Transformer in visual detection, a hybrid Transformer-CNN architecture for robotic grasp detection, HTC-Grasp, is developed to improve the accuracy of grasping unknown objects. The architecture employs an external attention-based hierarchical Transformer as an encoder to capture global context and correlation features across the entire dataset. A channel-wise attention-based CNN decoder adaptively adjusts the channel weights, resulting in more efficient feature aggregation. The proposed method is validated on the Cornell and Jacquard datasets, achieving image-wise detection accuracies of 98.3% and 95.8%, respectively. Object-wise detection accuracies of 96.9% and 92.4% are achieved on the same datasets. A physical experiment is also performed using the Elite 6-DoF robot, with a grasping accuracy of 93.3%, demonstrating the method's ability to grasp unknown objects in real scenarios. The results indicate that the proposed method outperforms other state-of-the-art methods.
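The abstract names two attention components but gives no equations, so the following is only an illustrative sketch, not the authors' implementation. It assumes the commonly published form of external attention (two small learnable memories shared across all samples, with a softmax over tokens followed by an L1 normalization over memory slots) and a squeeze-and-excitation style gate for the channel-wise attention. The function names, the weight shapes, and the reduction ratio are hypothetical.

```python
import numpy as np

def external_attention(x, m_k, m_v):
    """Assumed external-attention form (not taken from the paper).

    x:   (N, d) token features of one image
    m_k: (S, d) external key memory, shared across the whole dataset
    m_v: (S, d) external value memory, shared across the whole dataset
    """
    attn = x @ m_k.T                                   # (N, S) token-to-memory affinity
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)      # softmax over the token axis
    attn = attn / (attn.sum(axis=1, keepdims=True) + 1e-9)  # L1 norm over memory slots
    return attn @ m_v                                  # (N, d)

def channel_attention(feat, w1, w2):
    """Assumed squeeze-and-excitation style channel gate (hypothetical weights).

    feat: (C, H, W) decoder feature map
    w1:   (C // r, C) reduction weights, w2: (C, C // r) expansion weights
    """
    squeezed = feat.mean(axis=(1, 2))                  # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)            # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))        # per-channel weight in (0, 1)
    return feat * gate[:, None, None]                  # rescale each channel

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(16, 8))
    out = external_attention(tokens, rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
    fmap = rng.normal(size=(8, 5, 5))
    gated = channel_attention(fmap, rng.normal(size=(2, 8)), rng.normal(size=(8, 2)))
    print(out.shape, gated.shape)
```

Because the memories `m_k` and `m_v` are parameters rather than functions of the input, they can absorb statistics from every training image, which is one reading of the abstract's "correlation features across the entire dataset".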
Pages: 16
Related papers
50 in total
  • [21] scTCA: a hybrid Transformer-CNN architecture for imputation and denoising of scDNA-seq data
    Yu, Zhenhua
    Liu, Furui
    Li, Yang
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)
  • [22] Land Cover Classification of UAV Remote Sensing Based on Transformer-CNN Hybrid Architecture
    Lu, Tingyu
    Wan, Luhe
    Qi, Shaoqun
    Gao, Meixiang
    SENSORS, 2023, 23 (11)
  • [23] A Smart Dual-modal Aligned Transformer Deep Network for Robotic Grasp Detection
    Cang, Xin
    Zhang, Haojun
    Yang, Yuequan
    Cao, Zhiqiang
    Li, Fudong
    Zhu, Jiaming
    2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024, : 1230 - 1235
  • [24] FGNet: Faster Robotic Grasp Detection Network
    Cheng, Bangqiang
    Sun, Lei
    2024 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, ICMA 2024, 2024, : 96 - 100
  • [25] Robotic Grasp Detection for Parallel Grippers: A Review
    Yin, Zhiyun
    Li, Yujie
    Cai, Jintong
    Lu, Huimin
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 1184 - 1187
  • [26] Efficient Grasp Detection Network With Gaussian-Based Grasp Representation for Robotic Manipulation
    Cao, Hu
    Chen, Guang
    Li, Zhijun
    Feng, Qian
    Lin, Jianjie
    Knoll, Alois
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2023, 28 (03) : 1384 - 1394
  • [27] EGNet: Efficient Robotic Grasp Detection Network
    Yu, Sheng
    Zhai, Di-Hua
    Xia, Yuanqing
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2023, 70 (04) : 4058 - 4067
  • [28] MCT-Grasp: A Novel Grasp Detection Using Multimodal Embedding and Convolutional Modulation Transformer
    Yang, Guowei
    Jia, Tong
    Liu, Yizhe
    Liu, Zhenghao
    Zhang, Kaibo
    Du, Zhenjun
    IEEE SENSORS JOURNAL, 2024, 24 (23) : 39206 - 39217
  • [29] Anatomical Landmark Detection Using a Multiresolution Learning Approach with a Hybrid Transformer-CNN Model
    Viriyasaranon, Thanaporn
    Ma, Serie
    Choi, Jang-Hwan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VI, 2023, 14225 : 433 - 443
  • [30] A HYBRID GRASP MATRIX FOR COOPERATIVE ROBOTIC OBJECT MANIPULATION
    Ringold, Tyson L.
    Cipra, Raymond J.
    PROCEEDINGS OF THE ASME INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE - 2011, VOL 6, PTS A AND B, 2012, : 807 - 816