GANtlitz: Ultra High Resolution Generative Model for Multi-Modal Face Textures

Cited by: 0
Authors
Gruber, A. [1 ,2 ]
Collins, E. [2 ]
Meka, A. [2 ]
Mueller, F. [2 ]
Sarkar, K. [2 ]
Orts-Escolano, S. [2 ]
Prasso, L. [2 ]
Busch, J. [2 ]
Gross, M. [1 ]
Beeler, T. [2 ]
Affiliations
[1] ETH Zurich, Switzerland
[2] Google, Menlo Park, CA, USA
Keywords
CCS Concepts: • Computing methodologies → Machine learning; Texturing
DOI
10.1111/cgf.15039
CLC number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
High-resolution texture maps are essential to render photoreal digital humans for visual effects or to generate data for machine learning. The acquisition of high-resolution assets at scale is cumbersome: it involves enrolling a large number of human subjects, using expensive multi-view camera setups, and significant manual artistic effort to align the textures. To alleviate these problems, we introduce GANtlitz (a play on the German noun Antlitz, meaning face), a generative model that can synthesize multi-modal ultra-high-resolution face appearance maps for novel identities. Our method solves three distinct challenges: 1) unavailability of the very large data corpus generally required for training generative models, 2) memory and computational limitations of training a GAN at ultra-high resolutions, and 3) consistency of appearance features such as skin color, pores, and wrinkles in high-resolution textures across different modalities. We introduce dual-style blocks, an extension to the style blocks of the StyleGAN2 architecture, which improve multi-modal synthesis. Our patch-based architecture is trained only on image patches obtained from a small set of face textures (<100) and yet allows us to generate seamless appearance maps of novel identities at 6k × 4k resolution. Extensive qualitative and quantitative evaluations and baseline comparisons show the efficacy of our proposed system.
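The abstract describes dual-style blocks as an extension of StyleGAN2's style blocks, in which style vectors modulate convolution weights followed by demodulation. The record does not specify how the two styles are combined; below is a minimal, hypothetical numpy sketch of StyleGAN2-style weight modulation/demodulation driven by two style vectors (one shared across modalities, one per-modality), purely for illustration. The function name and the multiplicative combination of styles are assumptions, not the paper's actual design.

```python
import numpy as np

def dual_style_modulate(weight, style_shared, style_modal, eps=1e-8):
    """Hypothetical dual-style modulation of conv weights.

    weight:        (out_ch, in_ch, k, k) convolution kernel
    style_shared:  (in_ch,) per-input-channel scale shared across modalities
    style_modal:   (in_ch,) per-input-channel scale for one modality

    Follows the StyleGAN2 pattern: scale input channels by the style,
    then demodulate so each output filter has (approximately) unit norm.
    """
    s = style_shared * style_modal                 # assumed combination of the two styles
    w = weight * s[None, :, None, None]            # modulate input channels
    demod = 1.0 / np.sqrt((w ** 2).sum(axis=(1, 2, 3)) + eps)
    return w * demod[:, None, None, None]          # per-output-filter demodulation
```

Modulating a shared kernel with a shared plus a per-modality style is one plausible way to keep features such as pores and wrinkles aligned across modalities while still allowing modality-specific appearance; the actual mechanism is detailed in the paper itself.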
Pages: 14
Related papers
50 records in total (items 21–30 shown)
  • [21] Chen, Y.; Wang, H.; Liu, C.; Wang, L.; Liu, J.; Wu, W.: Generative Multi-Modal Mutual Enhancement Video Semantic Communications. CMES-Computer Modeling in Engineering & Sciences, 2024, 139(3): 2985-3009.
  • [22] Mukande, T.; Ali, E.; Caputo, A.; Dong, R.; O'Connor, N. E.: MMCRec: Towards Multi-modal Generative AI in Conversational Recommendation. Advances in Information Retrieval, ECIR 2024, Pt. III, 2024, 14610: 316-325.
  • [23] Liu, J.; Song, X.; Chen, Z.; Ma, J.: MGCM: Multi-modal generative compatibility modeling for clothing matching. Neurocomputing, 2020, 414: 215-224.
  • [24] Cui, W.; Bi, K.; Guo, J.; Cheng, X.: MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning. Findings of the Association for Computational Linguistics: ACL 2024, 2024: 1178-1192.
  • [25] Ma, H.; He, D.; Wang, X.; Jin, D.; Ge, M.; Wang, L.: Multi-Modal Sarcasm Detection Based on Dual Generative Processes. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, 2024: 2279-2287.
  • [26] Lee, D. J.; Kwak, K. C.; Min, J. O.; Chun, M. G.: Multi-modal biometrics system using face and signature. Computational Science and Its Applications - ICCSA 2004, Pt. 4, 2004, 3046: 828-837.
  • [27] Long, X.; Zeng, J.; Meng, F.; Ma, Z.; Zhang, K.; Zhou, B.; Zhou, J.: Generative Multi-Modal Knowledge Retrieval with Large Language Models. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 17, 2024: 18733-18741.
  • [28] Korthals, T.; Rudolph, D.; Leitner, J.; Hesse, M.; Rueckert, U.: Multi-Modal Generative Models for Learning Epistemic Active Sensing. 2019 International Conference on Robotics and Automation (ICRA), 2019: 3319-3325.
  • [29] Lee, D. J.; Kwak, K. C.; Min, J. O.; Chun, M. G.: Multi-modal biometrics system using face and signature. Computational Science and Its Applications - ICCSA 2004, Pt. 1, 2004, 3043: 635-644.
  • [30] Pierce, B.; Kuratate, T.; Maejima, A.; Morishima, S.; Matsusaka, Y.; Durkovic, M.; Diepold, K.; Cheng, G.: Development of an integrated multi-modal communication robotic face. 2012 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), 2012: 104+.