Imitation Learning from a Single Demonstration Leveraging Vector Quantization for Robotic Harvesting

Cited: 0
Authors
Porichis, Antonios [1 ,2 ]
Inglezou, Myrto [1 ]
Kegkeroglou, Nikolaos [3 ]
Mohan, Vishwanathan [1 ]
Chatzakos, Panagiotis [1 ]
Affiliations
[1] Univ Essex, AI Innovat Ctr, Wivenhoe Pk, Colchester CO4 3SQ, England
[2] Natl Struct Integr Res Ctr, Granta Pk, Cambridge CB21 6AL, England
[3] TWI Hellas, 280 Kifisias Ave, Halandri 15232, Greece
Funding
EU Horizon 2020;
Keywords
imitation learning; learning by demonstration; vector quantization; mushroom harvesting; visual servoing;
DOI
10.3390/robotics13070098
CLC Classification
TP24 [Robotics];
Discipline Codes
080202; 1405;
Abstract
The ability of robots to tackle complex, non-repetitive tasks will be key to bringing a new level of automation to agricultural applications that, owing to their high cognitive requirements, still involve labor-intensive, menial, and physically demanding activities. Harvesting is one such example: it requires a combination of motions that can generally be broken down into a visual servoing phase and a manipulation phase, with the latter often being straightforward to pre-program. In this work, we focus on fresh mushroom harvesting, which is still conducted manually by human pickers due to its high complexity. A key challenge is to enable harvesting with low-cost hardware and mechanical systems, such as soft grippers, which present additional challenges compared to their rigid counterparts. We devise an Imitation Learning model pipeline utilizing Vector Quantization to learn quantized embeddings directly from visual inputs. We test this approach in a realistic environment designed based on recordings of human experts harvesting real mushrooms. Our models can control a Cartesian robot with a soft, pneumatically actuated gripper to successfully replicate the mushroom outrooting sequence. We achieve 100% success in picking mushrooms among distractors with less than 20 min of data collection comprising a single expert demonstration and auxiliary, non-expert trajectories. The entire model pipeline requires less than 40 min of training on a single A4000 GPU and approx. 20 ms for inference on a standard laptop GPU.
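The core operation referenced in the abstract, vector quantization, maps each continuous embedding produced by an encoder to its nearest entry in a learned codebook. The paper's own architecture and codebook details are not reproduced here; the following is a generic NumPy sketch of the nearest-code lookup step, with illustrative toy values (the function name and shapes are assumptions, not the authors' implementation).

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each continuous embedding in z to its nearest codebook vector.

    z:        (N, D) array of encoder outputs
    codebook: (K, D) array of learned code vectors
    Returns the quantized embeddings (N, D) and the chosen code indices (N,).
    """
    # Squared Euclidean distance between every embedding and every code,
    # computed via broadcasting: result has shape (N, K)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)  # index of the nearest code per embedding
    return codebook[idx], idx

# Toy usage: 4 embeddings quantized against a 3-entry codebook in 2-D
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2], [-0.8, 0.9], [0.2, 0.1]])
z_quantized, idx = vector_quantize(z, codebook)
```

In VQ-VAE-style training the codebook itself is learned jointly with the encoder, and gradients are passed through the non-differentiable argmin with a straight-through estimator; the lookup above is only the inference-time discretization step.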
Pages: 18
Related Papers
50 records in total
  • [31] Learning Partial Ordering Constraints from a Single Demonstration
    Mohseni-Kabir, Anahita
    Rich, Charles
    Chernova, Sonia
    HRI'14: PROCEEDINGS OF THE 2014 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2014, : 248 - 249
  • [32] SingleDemoGrasp: Learning to Grasp From a Single Image Demonstration
    Sefat, Amir Mehman
    Angleraud, Alexandre
    Rahtu, Esa
    Pieters, Roel
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022, : 390 - 396
  • [33] Towards Learning to Imitate from a Single Video Demonstration
    Berseth, Glen
    Golemo, Florian
    Pal, Christopher
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24 : 1 - 26
  • [34] Interactive Hierarchical Task Learning from a Single Demonstration
    Mohseni-Kabir, Anahita
    Rich, Charles
    Chernova, Sonia
    Sidner, Candace L.
    Miller, Daniel
    PROCEEDINGS OF THE 2015 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI'15), 2015, : 205 - 212
  • [35] Using Learning from Demonstration (LfD) to perform the complete apple harvesting task
    van de Ven, Robert
    Shoushtari, Ali Leylavi
    Nieuwenhuizen, Ard
    Kootstra, Gert
    van Henten, Eldert J.
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 224
  • [36] Deep Adversarial Imitation Learning of Locomotion Skills from One-shot Video Demonstration
    Zhang, Huiwen
    Liu, Yuwang
    Zhou, Weijia
    2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 1257 - 1261
  • [37] Insightful stress detection from physiology modalities using Learning Vector Quantization
    de Vries, J. J. G.
    Pauws, Steffen C.
    Biehl, Michael
    NEUROCOMPUTING, 2015, 151 : 873 - 882
  • [38] Vector space architecture for emergent interoperability of systems by learning from demonstration
    Emruli, Blerim
    Sandin, Fredrik
    Delsing, Jerker
    BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, 2014, 9 : 33 - 45
  • [39] Vector space architecture for emergent interoperability of systems by learning from demonstration
    Emruli, Blerim
    Sandin, Fredrik
    Delsing, Jerker
    BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, 2015, 11 : 53 - 64
  • [40] Semantic Segmentation for Robotic Apple Harvesting: A Deep Learning Approach Leveraging U-Net, Synthetic Data, and Domain Adaptation
    Selvaraj, Ghokulji
    Farzan, Siavash
    2024 21ST INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS, UR 2024, 2024, : 611 - 618