Photo-realistic image synthesis from lines and appearance with modular modulation

Cited by: 2

Authors:

Luo, Wuyang [1]

Yang, Su [1]

Zhang, Weishan [2]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China

[2] China Univ Petr Huadong, Qingdao Campus, Qingdao, Peoples R China

Source:

NEUROCOMPUTING | 2022 / Vol. 503

Keywords:

Image Synthesis; Image-to-Image Translation; Feature Fusion; Generative Adversarial Networks

DOI:

10.1016/j.neucom.2022.06.007

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image-to-image translation has made significant progress by relying on conditional generative adversarial networks. However, many tasks require multiple condition images. This paper considers a classic application scenario: synthesizing photo-realistic images from lines and appearance, which describe structure and appearance information, respectively. Examples include generating realistic face images from portrait drawings and color scribbles, and generating photos from sketches and texture patches. The key to this type of task is how to fuse the two kinds of conditional information. We propose an image translation system driven by line and appearance images, introducing a modular architecture for condition fusion. Unlike previous condition fusion schemes, the main body of its generator is composed of stacked modulation units (MUs). Structural features and appearance features are progressively incorporated via cascaded MUs, each of which attends to local regions. The visualization experiment shows that this scheme lets the network automatically learn to decompose the fusion process into multiple sub-steps in latent spaces. Our model produces higher-quality results, both quantitatively and qualitatively, than the state-of-the-art method on different tasks and datasets. The ablation study demonstrates the effectiveness of the MUs and intuitively explains the feature fusion process through visualization. (c) 2022 Elsevier B.V. All rights reserved.
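The abstract does not specify the internals of a modulation unit, so the following is only a rough sketch of the general idea of progressive, locally attended fusion through stacked units. Everything in it is an assumption made for illustration: the `ModulationUnit` class, the 1x1-convolution parameterization, the sigmoid spatial mask, and the scale-and-shift form of the modulation are hypothetical stand-ins and should not be read as the authors' architecture.

```python
import numpy as np


def conv1x1(x, w, b):
    """Per-pixel linear map. x: (C_in, H, W), w: (C_out, C_in), b: (C_out,)."""
    return np.einsum("oc,chw->ohw", w, x) + b[:, None, None]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def instance_norm(x, eps=1e-5):
    """Normalize each channel over its spatial dimensions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)


class ModulationUnit:
    """Hypothetical unit: appearance features modulate the hidden state,
    gated by a spatial mask so each unit can act on local regions."""

    def __init__(self, channels, rng):
        c = channels

        def init(out_c, in_c):
            return rng.normal(0.0, 0.1, size=(out_c, in_c))

        # Assumed parameterization: random 1x1-conv weights, untrained.
        self.w_mask, self.b_mask = init(1, 2 * c), np.zeros(1)
        self.w_gamma, self.b_gamma = init(c, c), np.zeros(c)
        self.w_beta, self.b_beta = init(c, c), np.zeros(c)

    def __call__(self, hidden, structure, appearance):
        # Spatial mask in (0, 1) computed from the hidden state and structure.
        joint = np.concatenate([hidden, structure], axis=0)
        mask = sigmoid(conv1x1(joint, self.w_mask, self.b_mask))
        # Per-pixel scale and shift predicted from the appearance features.
        gamma = conv1x1(appearance, self.w_gamma, self.b_gamma)
        beta = conv1x1(appearance, self.w_beta, self.b_beta)
        modulated = (1.0 + gamma) * instance_norm(hidden) + beta
        # Blend: only the masked regions receive the modulated features.
        return (1.0 - mask) * hidden + mask * modulated, mask


def fuse(structure, appearance, units):
    """Pass the features through the stacked units, collecting each mask."""
    hidden, masks = structure, []
    for unit in units:
        hidden, mask = unit(hidden, structure, appearance)
        masks.append(mask)
    return hidden, masks


rng = np.random.default_rng(0)
structure = rng.normal(size=(8, 16, 16))   # stand-in for line features
appearance = rng.normal(size=(8, 16, 16))  # stand-in for appearance features
units = [ModulationUnit(8, rng) for _ in range(4)]
fused, masks = fuse(structure, appearance, units)
print(fused.shape, len(masks))
```

In a sketch like this, inspecting the per-unit `masks` is one way to visualize how the fusion is split into sub-steps, which is in the spirit of the visualization experiment the abstract mentions.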

Pages: 81-91

Page count: 11

