Imitation Learning from a Single Demonstration Leveraging Vector Quantization for Robotic Harvesting

Cited: 0
Authors
Porichis, Antonios [1 ,2 ]
Inglezou, Myrto [1 ]
Kegkeroglou, Nikolaos [3 ]
Mohan, Vishwanathan [1 ]
Chatzakos, Panagiotis [1 ]
Affiliations
[1] Univ Essex, AI Innovat Ctr, Wivenhoe Pk, Colchester CO4 3SQ, England
[2] Natl Struct Integr Res Ctr, Granta Pk, Cambridge CB21 6AL, England
[3] TWI Hellas, 280 Kifisias Ave, Halandri 15232, Greece
Funding
EU Horizon 2020;
Keywords
imitation learning; learning by demonstration; vector quantization; mushroom harvesting; visual servoing;
DOI
10.3390/robotics13070098
CLC Classification
TP24 [Robotics];
Discipline Codes
080202; 1405;
Abstract
The ability of robots to tackle complex, non-repetitive tasks will be key to bringing a new level of automation to agricultural applications that, owing to their high cognitive requirements, still involve labor-intensive, menial, and physically demanding activities. Harvesting is one such example: it requires a combination of motions that can generally be broken down into a visual servoing phase and a manipulation phase, with the latter often being straightforward to pre-program. In this work, we focus on fresh mushroom harvesting, which is still conducted manually by human pickers due to its high complexity. A key challenge is to enable harvesting with low-cost hardware and mechanical systems, such as soft grippers, which present additional challenges compared to their rigid counterparts. We devise an Imitation Learning model pipeline utilizing Vector Quantization to learn quantized embeddings directly from visual inputs. We test this approach in a realistic environment designed based on recordings of human experts harvesting real mushrooms. Our models can control a Cartesian robot with a soft, pneumatically actuated gripper to successfully replicate the mushroom outrooting sequence. We achieve 100% success in picking mushrooms among distractors with less than 20 min of data collection comprising a single expert demonstration and auxiliary, non-expert trajectories. The entire model pipeline requires less than 40 min of training on a single A4000 GPU and approx. 20 ms for inference on a standard laptop GPU.
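The core operation referenced in the abstract, vector quantization, maps each continuous embedding produced by an encoder to its nearest entry in a learned codebook. The paper's own architecture and codebook details are not reproduced here; the following is a generic NumPy sketch of the nearest-code lookup step, with illustrative toy values (the function name and shapes are assumptions, not the authors' implementation).

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each continuous embedding in z to its nearest codebook vector.

    z:        (N, D) array of encoder outputs
    codebook: (K, D) array of learned code vectors
    Returns the quantized embeddings (N, D) and the chosen code indices (N,).
    """
    # Squared Euclidean distance between every embedding and every code,
    # computed via broadcasting: result has shape (N, K)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)  # index of the nearest code per embedding
    return codebook[idx], idx

# Toy usage: 4 embeddings quantized against a 3-entry codebook in 2-D
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2], [-0.8, 0.9], [0.2, 0.1]])
z_quantized, idx = vector_quantize(z, codebook)
```

In VQ-VAE-style training the codebook itself is learned jointly with the encoder, and gradients are passed through the non-differentiable argmin with a straight-through estimator; the lookup above is only the inference-time discretization step.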
Pages: 18
Related Papers
50 records in total
  • [31] Learning Partial Ordering Constraints from a Single Demonstration
    Mohseni-Kabir, Anahita
    Rich, Charles
    Chernova, Sonia
    HRI'14: PROCEEDINGS OF THE 2014 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2014, : 248 - 249
  • [32] SingleDemoGrasp: Learning to Grasp From a Single Image Demonstration
    Sefat, Amir Mehman
    Angleraud, Alexandre
    Rahtu, Esa
    Pieters, Roel
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022, : 390 - 396
  • [33] Towards Learning to Imitate from a Single Video Demonstration
    Berseth, Glen
    Golemo, Florian
    Pal, Christopher
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24 : 1 - 26
  • [34] Interactive Hierarchical Task Learning from a Single Demonstration
    Mohseni-Kabir, Anahita
    Rich, Charles
    Chernova, Sonia
    Sidner, Candace L.
    Miller, Daniel
    PROCEEDINGS OF THE 2015 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI'15), 2015, : 205 - 212
  • [35] Using Learning from Demonstration (LfD) to perform the complete apple harvesting task
    van de Ven, Robert
    Shoushtari, Ali Leylavi
    Nieuwenhuizen, Ard
    Kootstra, Gert
    van Henten, Eldert J.
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 224
  • [36] Deep Adversarial Imitation Learning of Locomotion Skills from One-shot Video Demonstration
    Zhang, Huiwen
    Liu, Yuwang
    Zhou, Weijia
    2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 1257 - 1261
  • [37] Insightful stress detection from physiology modalities using Learning Vector Quantization
    de Vries, J. J. G.
    Pauws, Steffen C.
    Biehl, Michael
    NEUROCOMPUTING, 2015, 151 : 873 - 882
  • [38] Vector space architecture for emergent interoperability of systems by learning from demonstration
    Emruli, Blerim
    Sandin, Fredrik
    Delsing, Jerker
    BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, 2014, 9 : 33 - 45
  • [39] Vector space architecture for emergent interoperability of systems by learning from demonstration
    Emruli, Blerim
    Sandin, Fredrik
    Delsing, Jerker
    BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, 2015, 11 : 53 - 64
  • [40] Semantic Segmentation for Robotic Apple Harvesting: A Deep Learning Approach Leveraging U-Net, Synthetic Data, and Domain Adaptation
    Selvaraj, Ghokulji
    Farzan, Siavash
    2024 21ST INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS, UR 2024, 2024, : 611 - 618