Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Cited by: 2

Authors:

Lu, Jinguo [1]

Ren, Weihong [1,2]

Jiang, Weibo [1]

Chen, Xi'ai [2,3]

Wang, Qiang [4,5]

Han, Zhi [2,3]

Liu, Honghai [1]

Affiliations:

[1] Harbin Inst Technol, Shenzhen, Peoples R China

[2] Shenyang Univ, Shenyang, Peoples R China

[3] Chinese Acad Sci, Shenyang Inst Automat, State Key Lab Robot, Shenyang, Peoples R China

[4] Chinese Acad Sci, Inst Robot, Beijing, Peoples R China

[5] Chinese Acad Sci, Inst Intelligent Mfg, Beijing, Peoples R China

Source:

2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024

Funding:

National Natural Science Foundation of China

DOI:

10.1109/CVPR52733.2024.02665

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recently, Vision-Language Models (VLMs) have greatly advanced Human-Object Interaction (HOI) detection. Existing VLM-based HOI detectors typically adopt a handcrafted template (e.g., "a photo of a person [action] a/an [object]") to acquire text knowledge through the VLM text encoder. However, such approaches, which encode action-specific text prompts only at the vocabulary level, may suffer from learning ambiguity because they do not explore fine-grained clues from the perspective of interaction context. In this paper, we propose a novel method to discover Syntactic Interaction Clues for HOI detection (SICHOI) by using a VLM. Specifically, we first investigate what the essential elements of an interaction context are, and then establish a syntactic interaction bank at three levels: spatial relationship, action-oriented posture, and situational condition. Further, to align visual features with the syntactic interaction bank, we adopt a multi-view extractor to jointly aggregate visual features from the instance, interaction, and image levels accordingly. In addition, we introduce a dual cross-attention decoder to perform context propagation between text knowledge and visual features, thereby enhancing HOI detection. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on HICO-DET and V-COCO.

Pages: 28212-28222

Page count: 11
