DaNet: Decompose-and-aggregate Network for 3D Human Shape and Pose Estimation

Cited by: 22
Authors
Zhang, Hongwen [1, 2, 3]
Cao, Jie [1, 2, 3]
Lu, Guo [4]
Ouyang, Wanli [5]
Sun, Zhenan [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, CRIPAC, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[5] Univ Sydney, SenseTime Comp Vis Res Grp, Sydney, NSW, Australia
Funding
National Natural Science Foundation of China
Keywords
Decompose-and-aggregate Network; 3D human shape and pose estimation; position-aided rotation feature refinement
DOI
10.1145/3343031.3351057
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Reconstructing 3D human shape and pose from a monocular image remains challenging despite the promising results achieved by recent learning-based methods. The commonly observed misalignment arises for two reasons: the mapping from image space to model space is highly non-linear, and the rotation-based pose representation of the body model is prone to drift in joint positions. In this work, we present the Decompose-and-aggregate Network (DaNet) to address these issues. DaNet includes three new designs, namely UVI guided learning, decomposition for fine-grained perception, and aggregation for robust prediction. First, we adopt UVI maps, which densely build a bridge between 2D pixels and 3D vertices, as an intermediate representation to facilitate the learning of the image-to-model mapping. Second, we decompose the prediction task into one global stream and multiple local streams so that the network not only provides global perception for camera and shape prediction, but also has detailed perception for part pose prediction. Lastly, we aggregate the messages from the local streams to enhance the robustness of part pose prediction, where a position-aided rotation feature refinement strategy is proposed to exploit the spatial relationship between body parts. Such a refinement strategy is more efficient since the correlations between position features are stronger than those in the original rotation feature space. The effectiveness of our method is validated on the Human3.6M and UP-3D datasets. Experimental results show that the proposed method significantly improves the reconstruction performance in comparison with previous state-of-the-art methods. Our code is publicly available at https://github.com/HongwenZhang/DaNet-3DHumanReconstrution.
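The abstract's remark that a rotation-based pose representation is prone to drift in joint positions can be illustrated with a toy example. The sketch below is not DaNet and does not use the SMPL body model; it assumes a hypothetical planar three-bone limb (the bone lengths, the angles, and the function names `forward_kinematics` and `joint_errors` are all made up for illustration) and shows how a small rotation error at the root joint produces position errors that grow along the kinematic chain.

```python
import math


def forward_kinematics(relative_angles, bone_lengths):
    """Return 2D joint positions of a planar chain rooted at the origin.

    Each angle is relative to the parent bone, so the absolute heading of a
    bone is the sum of all angles up to and including its own.
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for angle, length in zip(relative_angles, bone_lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


def joint_errors(true_angles, predicted_angles, bone_lengths):
    """Euclidean distance between corresponding joints of two poses."""
    true_pos = forward_kinematics(true_angles, bone_lengths)
    pred_pos = forward_kinematics(predicted_angles, bone_lengths)
    return [math.dist(p, q) for p, q in zip(true_pos, pred_pos)]


# Hypothetical three-bone limb; all values are chosen for illustration only.
bone_lengths = [0.3, 0.3, 0.2]
true_angles = [0.0, 0.0, 0.0]
# A 5-degree error at the root joint only; the other rotations are exact.
predicted_angles = [math.radians(5.0), 0.0, 0.0]

errors = joint_errors(true_angles, predicted_angles, bone_lengths)
for index, error in enumerate(errors):
    print(f"joint {index}: position error = {error:.4f}")
```

In this toy setting only one rotation is wrong, yet every joint below it is displaced, and the displacement is largest at the end of the chain. This is one plausible reading of why the abstract argues for aggregating information across body parts rather than predicting each part's rotation in isolation.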
Pages: 935-944
Page count: 10