Photo-realistic image synthesis from lines and appearance with modular modulation

Cited by: 2
Authors
Luo, Wuyang [1]
Yang, Su [1]
Zhang, Weishan [2]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] China Univ Petr Huadong, Qingdao Campus, Qingdao, Peoples R China
Keywords
Image Synthesis; Image-to-Image Translation; Feature Fusion; Generative Adversarial Networks
DOI
10.1016/j.neucom.2022.06.007
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image-to-image translation has made significant progress by relying on conditional generative adversarial networks. However, many tasks require multiple condition images. This paper considers a classic application scenario: synthesizing photo-realistic images from lines and appearance, which describe structure and appearance information, respectively. Examples include generating realistic face images from portrait drawings and color scribbles, and generating photos from sketches and texture patches. The key to this type of task is how to fuse the two kinds of conditional information. We propose an image translation system driven by line and appearance images, introducing a modular architecture for condition fusion. Unlike previous condition fusion schemes, the main body of our generator is composed of stacked modulation units (MUs). Structural features and appearance features are progressively incorporated via cascaded MUs, each of which attends to local regions. The visualization experiment shows that this scheme lets the network automatically learn to decompose the fusion process into multiple sub-steps in latent spaces. Our model produces higher-quality results, both quantitatively and qualitatively, than the state-of-the-art method on different tasks and datasets. The ablation study demonstrates the effectiveness of the MUs and intuitively explains the process of feature fusion through visualization. (c) 2022 Elsevier B.V. All rights reserved.
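The abstract does not specify the internals of a modulation unit, so the following is only a toy sketch of the general idea of cascaded, locally gated fusion, not the authors' implementation. It assumes a FiLM-style scale-and-shift driven by the appearance features and a per-location sigmoid gate standing in for "attention to local regions". The names `ModulationUnit`, `scale_w`, `shift_w`, `gate_w`, `gate_b`, and `fuse` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List

Vector = List[float]
FeatureMap = List[Vector]  # one feature vector per spatial location


@dataclass
class ModulationUnit:
    """Hypothetical MU: appearance features modulate structure features."""
    scale_w: Vector      # assumed per-channel weights producing a scale term
    shift_w: Vector      # assumed per-channel weights producing a shift term
    gate_w: Vector       # assumed weights of the per-location gate
    gate_b: float = 0.0  # assumed gate bias

    def __call__(self, structure: FeatureMap, appearance: FeatureMap) -> FeatureMap:
        fused = []
        for s, a in zip(structure, appearance):
            # Sigmoid gate in (0, 1): how strongly this unit acts at this location.
            score = self.gate_b + sum(w * v for w, v in zip(self.gate_w, a))
            gate = 1.0 / (1.0 + math.exp(-score))
            # Residual update: structure plus a gated scale-and-shift from appearance.
            fused.append([
                sc + gate * (sw * ac * sc + bw * ac)
                for sc, ac, sw, bw in zip(s, a, self.scale_w, self.shift_w)
            ])
        return fused


def fuse(structure: FeatureMap, appearance: FeatureMap,
         units: List[ModulationUnit]) -> FeatureMap:
    """Apply the units in sequence; each one performs one sub-step of the fusion."""
    x = structure
    for unit in units:
        x = unit(x, appearance)
    return x


if __name__ == "__main__":
    structure = [[1.0, 1.0], [0.5, -0.5]]   # two locations, two channels
    appearance = [[2.0, 4.0], [0.0, 0.0]]   # second location carries no appearance cue
    units = [ModulationUnit([0.0, 0.0], [1.0, 1.0], [0.0, 0.0]) for _ in range(2)]
    print(fuse(structure, appearance, units))
```

In this sketch a location with all-zero appearance features is left unchanged by every unit, which mirrors the idea that each unit only edits the regions it is responsible for. A real generator would use learned convolutional layers rather than fixed per-channel weights.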
Pages: 81-91
Page count: 11