METACAP: Meta-learning Priors from Multi-view Imagery for Sparse-View Human Performance Capture and Rendering

Cited by: 0
Authors
Sun, Guoxing [1 ]
Dabral, Rishabh [1 ]
Fua, Pascal [2 ]
Theobalt, Christian [1 ]
Habermann, Marc [1 ]
Affiliations
[1] Max Planck Inst Informat, Saarland Informat Campus, Saarbrucken, Germany
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Keywords
Human Performance Capture; Meta Learning; EFFICIENT
DOI
10.1007/978-3-031-72952-2_20
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Faithful human performance capture and free-view rendering from sparse RGB observations is a long-standing problem in Vision and Graphics. The main challenges are the lack of observations and the inherent ambiguities of the setting, e.g. occlusions and depth ambiguity. As a result, radiance fields, which have shown great promise in capturing high-frequency appearance and geometry details in dense setups, perform poorly when naively supervised on sparse camera views, as the field simply overfits to the sparse-view inputs. To address this, we propose METACAP, a method for efficient and high-quality geometry recovery and novel view synthesis given very sparse or even a single view of the human. Our key idea is to meta-learn the radiance field weights solely from potentially sparse multi-view videos, which can serve as a prior when fine-tuning them on sparse imagery depicting the human. This prior provides a good network weight initialization, thereby effectively addressing ambiguities in sparse-view capture. Due to the articulated structure of the human body and motion-induced surface deformations, learning such a prior is non-trivial. Therefore, we propose to meta-learn the field weights in a pose-canonicalized space, which reduces the spatial feature range and makes feature learning more effective. Consequently, one can fine-tune our field parameters to quickly generalize to unseen poses, novel illumination conditions, as well as novel and sparse (even monocular) camera views. To evaluate our method under different scenarios, we collect a new dataset, WILDDYNACAP, which contains subjects captured both in a dense camera dome and with in-the-wild sparse camera rigs, and we demonstrate superior results compared to recent state-of-the-art methods on both public datasets and WILDDYNACAP.
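The core recipe the abstract describes — meta-learn network weights across many observations so that they serve as an initialization which fine-tunes quickly on sparse data — can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation (METACAP meta-learns radiance field weights in a pose-canonicalized space); here a Reptile-style meta-update on a toy 1-D regression family stands in for the real setting, purely to show the initialize-then-fine-tune mechanic. All function names and hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_batch():
    """A toy 'task': fit y = a*x + b for randomly drawn (a, b).

    In METACAP the analogue would be fitting a radiance field to one
    set of (sparse) views; here a 2-parameter linear model suffices.
    """
    a, b = rng.uniform(-1, 1, size=2)
    x = rng.uniform(-1, 1, size=32)
    return x, a * x + b

def predict(w, x):
    return w[0] * x + w[1]

def loss(w, x, y):
    return ((predict(w, x) - y) ** 2).mean()

def grad(w, x, y):
    err = predict(w, x) - y
    return 2.0 * np.array([(err * x).mean(), err.mean()])

def inner_adapt(w, x, y, lr=0.1, steps=10):
    """Fine-tune weights on one task (the 'sparse-view fine-tuning' step)."""
    for _ in range(steps):
        w = w - lr * grad(w, x, y)
    return w

def reptile(meta_w, meta_lr=0.5, meta_iters=200):
    """Reptile meta-update: nudge the meta-weights toward each
    task-adapted solution, yielding a fast-adapting initialization."""
    for _ in range(meta_iters):
        x, y = task_batch()
        adapted = inner_adapt(meta_w.copy(), x, y)
        meta_w = meta_w + meta_lr * (adapted - meta_w)
    return meta_w

# Meta-learn a prior initialization, then fine-tune on a new task
# with only a few gradient steps.
meta_w = reptile(np.zeros(2))
x_new, y_new = task_batch()
finetuned = inner_adapt(meta_w.copy(), x_new, y_new, steps=5)
```

The design point this mirrors is that the prior is carried entirely in the weight initialization: no architecture changes are needed at fine-tuning time, which is what makes adaptation to unseen poses and camera setups cheap.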
Pages: 341-361 (21 pages)