HOIST-Former: Hand-held Objects Identification, Segmentation, and Tracking in the Wild

Cited by: 0
Authors
Narasimhaswamy, Supreeth [1]
Nguyen, Huy Anh [1]
Huang, Lihan [1]
Hoai, Minh [1, 2]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
[2] VinAI Res, Hanoi, Vietnam
Funding
National Science Foundation (USA)
DOI
10.1109/CVPR52733.2024.00228
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We address the challenging task of identifying, segmenting, and tracking hand-held objects, which is crucial for applications such as human action segmentation and performance evaluation. This task is particularly challenging due to heavy occlusion, rapid motion, and the transitory nature of objects being hand-held, where an object may be held, released, and subsequently picked up again. To tackle these challenges, we have developed a novel transformer-based architecture called HOIST-Former. HOIST-Former is adept at spatially and temporally segmenting hands and objects by iteratively pooling features from each other, ensuring that the processes of identification, segmentation, and tracking of hand-held objects depend on the hands' positions and their contextual appearance. We further refine HOIST-Former with a contact loss that focuses on areas where hands are in contact with objects. Moreover, we also contribute an in-the-wild video dataset called HOIST, which comprises 4,125 videos complete with bounding boxes, segmentation masks, and tracking IDs for hand-held objects. Through experiments on the HOIST dataset and two additional public datasets, we demonstrate the efficacy of HOIST-Former in segmenting and tracking hand-held objects. Project page: https://supreethn.github.io/research/hoistformer/index.html
Pages: 2351-2361
Page count: 11