HOIST-Former: Hand-held Objects Identification, Segmentation, and Tracking in the Wild

Cited by: 0
Authors
Narasimhaswamy, Supreeth [1]
Nguyen, Huy Anh [1]
Huang, Lihan [1]
Hoai, Minh [1, 2]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
[2] VinAI Res, Hanoi, Vietnam
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024
Funding
National Science Foundation (USA)
Keywords
DOI
10.1109/CVPR52733.2024.00228
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We address the challenging task of identifying, segmenting, and tracking hand-held objects, which is crucial for applications such as human action segmentation and performance evaluation. This task is particularly challenging due to heavy occlusion, rapid motion, and the transitory nature of objects being hand-held, where an object may be held, released, and subsequently picked up again. To tackle these challenges, we have developed a novel transformer-based architecture called HOIST-Former. HOIST-Former is adept at spatially and temporally segmenting hands and objects by iteratively pooling features from each other, ensuring that the processes of identification, segmentation, and tracking of hand-held objects depend on the hands' positions and their contextual appearance. We further refine HOIST-Former with a contact loss that focuses on areas where hands are in contact with objects. We also contribute an in-the-wild video dataset called HOIST, which comprises 4,125 videos complete with bounding boxes, segmentation masks, and tracking IDs for hand-held objects. Through experiments on the HOIST dataset and two additional public datasets, we demonstrate the efficacy of HOIST-Former in segmenting and tracking hand-held objects. Project page: https://supreethn.github.io/research/hoistformer/index.html
Pages: 2351-2361
Page count: 11
Related papers
50 in total
  • [31] Hand-held 3D scanner without sensor pose tracking or surface markers
    Kofman, J.
    Borribanbunpotkat, K.
    HIGH VALUE MANUFACTURING: ADVANCED RESEARCH IN VIRTUAL AND RAPID PROTOTYPING, 2014, : 429 - 434
  • [32] Finger Tracking for Hand-held Device Interface Using Profile-matching Stereo Vision
    Chang, Yung-Ping
    Lee, Dah-Jye
    Moore, Jason
    Desai, Alok
    Tippetts, Beau
    INTELLIGENT ROBOTS AND COMPUTER VISION XXX: ALGORITHMS AND TECHNIQUES, 2013, 8662
  • [33] The diagnostic accuracy of the hand-held Raman spectrometer for the identification of anti-malarial drugs
    Visser, Benjamin J.
    de Vries, Sophia G.
    Bache, Emmanuel B.
    Meerveld-Gerrits, Janneke
    Kroon, Danielle
    Boersma, Jimmy
    Agnandji, Selidji T.
    van Vugt, Michele
    Grobusch, Martin P.
    MALARIA JOURNAL, 2016, 15
  • [34] Attitude Tracking Using an Integrated Inertial and Optical Navigation System for Hand-Held Surgical Instruments
    Oh, Hyun-Min
    Kim, Min-Young
    2014 14TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2014), 2014, : 290 - 293
  • [35] Detection and identification of canker and blight on orange trees using a hand-held Raman spectrometer
    Sanchez, Lee
    Pant, Shankar
    Irey, Mike
    Mandadi, Kranthi
    Kurouski, Dmitry
    JOURNAL OF RAMAN SPECTROSCOPY, 2019, 50 (12) : 1875 - 1880
  • [36] The diagnostic accuracy of the hand-held Raman spectrometer for the identification of anti-malarial drugs
    Benjamin J. Visser
    Sophia G. de Vries
    Emmanuel B. Bache
    Janneke Meerveld-Gerrits
    Daniëlle Kroon
    Jimmy Boersma
    Selidji T. Agnandji
    Michèle van Vugt
    Martin P. Grobusch
    Malaria Journal, 15
  • [37] On the identification of the effect of prohibiting hand-held cell phone use while driving: Comment
    Sampaio, Breno
    TRANSPORTATION RESEARCH PART A-POLICY AND PRACTICE, 2010, 44 (09) : 766 - 770
  • [38] Automatic Identification of Hand-Held Vibrating Tools Through Commercial Smartwatches and Machine Learning
    Sigcha, Luis
    Pavon, Ignacio
    Nisi, Stefania
    de Arcas, Guillermo
    OCCUPATIONAL AND ENVIRONMENTAL SAFETY AND HEALTH II, 2020, 277 : 481 - 489
  • [39] Hand-held photoionization instruments for quantitative detection of sarin vapor and for rapid qualitative screening of contaminated objects
    Smith, Philip A.
    Lepage, Carmela Jackson
    Harrer, Kristin L.
    Brochu, Paul J.
    JOURNAL OF OCCUPATIONAL AND ENVIRONMENTAL HYGIENE, 2007, 4 (10) : 729 - 738
  • [40] Quick Viewpoint Switching for Manipulating Virtual Objects in Hand-Held Augmented Reality using Stored Snapshots
    Sukan, Mengu
    Feiner, Steven
    Tversky, Barbara
    Energin, Semih
    2012 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY (ISMAR) - SCIENCE AND TECHNOLOGY, 2012, : 217 - 226