Orientation Cues-Aware Facial Relationship Representation for Head Pose Estimation via Transformer

Cited by: 74
Authors
Liu, Hai [1]
Zhang, Cheng [1]
Deng, Yongjian [2]
Liu, Tingting [3, 4]
Zhang, Zhaoli [1]
Li, You-Fu [4]
Affiliations
[1] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China
[2] Beijing Univ Technol, Coll Comp Sci, Beijing 100124, Peoples R China
[3] Hubei Univ, Sch Educ, Wuhan 430062, Hubei, Peoples R China
[4] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
Keywords
Head; Transformers; Visualization; Computer architecture; Pose estimation; Task analysis; Semantics; Head pose estimation; attention mechanism; relationship perception; deep learning; transformer;
DOI
10.1109/TIP.2023.3331309
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Head pose estimation (HPE) is an indispensable upstream task in human-machine interaction, self-driving, and attention detection. However, practical head pose applications face several challenges, such as severe occlusion, low illumination, and extreme orientations. To address these challenges, we identify three cues from head images, namely, critical minority relationships, neighborhood orientation relationships, and significant facial changes. On the basis of these three cues, two key insights on head poses are revealed: 1) the intra-orientation relationship and 2) the cross-orientation relationship. To leverage these two insights, a novel relationship-driven method is proposed based on the Transformer architecture, in which facial and orientation relationships can be learned. Specifically, we design several orientation tokens to explicitly encode basic orientation regions. In addition, a novel token guide multi-loss function is designed to guide the orientation tokens as they learn the desired regional similarities and relationships. Experimental results on three challenging benchmark HPE datasets show that our proposed TokenHPE achieves state-of-the-art performance. Moreover, qualitative visualizations are provided to verify the effectiveness of the token-learning methodology.
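One way to picture the "basic orientation regions" mentioned in the abstract is as a coarse partition of the pose space. The sketch below is illustrative only: the abstract does not say how the regions are defined, so the 3 x 3 grid over yaw and pitch, the 30-degree thresholds, and the names `YAW_EDGES`, `PITCH_EDGES`, and `orientation_region` are all assumptions, not the authors' partition.

```python
from typing import Tuple

# Hypothetical bin edges in degrees; the abstract does not specify them.
YAW_EDGES: Tuple[float, ...] = (-30.0, 30.0)
PITCH_EDGES: Tuple[float, ...] = (-30.0, 30.0)


def _bin(angle: float, edges: Tuple[float, ...]) -> int:
    """Return the index of the interval (split at `edges`) containing `angle`."""
    for i, edge in enumerate(edges):
        if angle < edge:
            return i
    return len(edges)


def orientation_region(yaw: float, pitch: float) -> int:
    """Map (yaw, pitch) in degrees to one of 9 coarse regions, row-major by pitch."""
    n_yaw_bins = len(YAW_EDGES) + 1
    return _bin(pitch, PITCH_EDGES) * n_yaw_bins + _bin(yaw, YAW_EDGES)


if __name__ == "__main__":
    # A near-frontal pose falls into the central cell of the assumed grid.
    print(orientation_region(yaw=0.0, pitch=0.0))
```

Under this assumed partition, each region index could serve as a coarse label with which one orientation token is associated; how the paper actually ties tokens to regions is not stated in the abstract.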
Pages: 6289-6302
Number of pages: 14