HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding

Cited by: 0
Authors
Zheng, Hao [1]
Lee, Regina [1]
Lu, Yuqian [1]
Affiliations
[1] Univ Auckland, Dept Mech & Mechatron Engn, Auckland, New Zealand
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding comprehensive assembly knowledge from videos is critical for futuristic ultra-intelligent industry. To enable technological breakthroughs, we present HA-ViD, an assembly video dataset that features representative industrial assembly scenarios, a natural procedural knowledge acquisition process, and consistent human-robot shared annotations. Specifically, HA-ViD captures diverse collaboration patterns of real-world assembly, natural human behaviors and learning progression during assembly, and granular action annotations that break each action down into subject, action verb, manipulated object, target object, and tool. We provide 3222 multi-view and multi-modality videos, 1.5M frames, 96K temporal labels and 2M spatial labels. We benchmark four foundational video understanding tasks: action recognition, action segmentation, object detection and multi-object tracking. Importantly, we analyze their performance and the further reasoning steps needed for comprehending knowledge in assembly progress, process efficiency, task collaboration, skill parameters and human intention. Details of HA-ViD are available at: https://iai-hrc.github.io/ha-vid.
Pages: 13