HTC-Grasp: A Hybrid Transformer-CNN Architecture for Robotic Grasp Detection

Cited by: 5

Authors:

Zhang, Qiang [1]

Zhu, Jianwei [1]

Sun, Xueying [1]

Liu, Mingmin [2]

Affiliations:

[1] Jiangsu Univ Sci & Technol, Sch Automat, 666 Changhui Rd, Zhenjiang 212100, Peoples R China

[2] SIASUN Robot & Automat Co Ltd, Cent Res Inst, 16 Jinhui St, Shenyang 110168, Peoples R China

Source:

ELECTRONICS | 2023 / Vol. 12 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

robotic grasp; transformer; attentional mechanism;

DOI:

10.3390/electronics12061505

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Accurately detecting suitable grasp areas on unknown objects from visual information remains a challenging task. Inspired by the success of the Vision Transformer in visual detection, this work develops HTC-Grasp, a hybrid Transformer-CNN architecture for robotic grasp detection, to improve the accuracy of grasping unknown objects. The architecture employs an external attention-based hierarchical Transformer as the encoder to capture global context and correlation features across the entire dataset. A channel-wise attention-based CNN decoder adaptively adjusts the channel weights, resulting in more efficient feature aggregation. The proposed method is validated on the Cornell and Jacquard datasets, achieving image-wise detection accuracies of 98.3% and 95.8%, respectively, and object-wise detection accuracies of 96.9% and 92.4% on the same datasets. A physical experiment with an Elite 6-DoF robot achieves a grasping accuracy of 93.3%, demonstrating the method's ability to grasp unknown objects in real scenarios. The results indicate that the proposed method outperforms other state-of-the-art methods.


Pages: 16

