Toward Multi-Modal Conditioned Fashion Image Translation

Cited by: 13

Authors:

Gu, Xiaoling [1]

Yu, Jun [1]

Wong, Yongkang [2]

Kankanhalli, Mohan S. [2]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Peoples R China

[2] Natl Univ Singapore, Sch Comp, Singapore 119613, Singapore

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2021 / Vol. 23

Funding:

National Science Foundation (United States); National Research Foundation, Singapore;

Keywords:

Generative adversarial network; fashion image synthesis; image-to-image translation; RETRIEVAL;

DOI:

10.1109/TMM.2020.3009500

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Having the capability to synthesize photo-realistic fashion product images conditioned on multiple attributes or modalities would enable many exciting new applications. In this work, we propose an end-to-end network architecture, built upon a new generative adversarial network, for automatically synthesizing photo-realistic images of fashion products under multiple conditions. Given an input pose image consisting of a 2D skeleton pose and a sentence description of products, our model synthesizes a fashion image that preserves the same pose and depicts the fashion products described in the text. Specifically, the generator G tries to generate realistic-looking fashion images based on a <pose, text> pair condition to fool the discriminator. An attention network is added to enhance the generator; it predicts a probability map indicating which parts of the image need to be attended to for translation. In contrast, the discriminator D distinguishes real images from translated ones based on the input pose image and text description. The discriminator is divided into two multi-scale sub-discriminators to improve the image discrimination task. Quantitative and qualitative analyses demonstrate that our method is capable of synthesizing realistic images that retain the poses of the given images while matching the semantics of the provided sentence descriptions.
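
The abstract does not state how the predicted probability map is applied. As an illustration only, the sketch below shows one blending rule commonly used in attention-guided image translation: each output pixel is a convex combination of the translated pixel and the source pixel, weighted by the attention value. The function name `blend_with_attention`, the array shapes, and the blending rule itself are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def blend_with_attention(source, translated, attention):
    """Per-pixel blend (assumed rule, not confirmed by the abstract).

    source, translated: float arrays of shape (H, W, C).
    attention: float array of shape (H, W) with values in [0, 1].
    An attention value of 1 keeps the translated pixel, 0 keeps the source pixel.
    """
    source = np.asarray(source, dtype=np.float64)
    translated = np.asarray(translated, dtype=np.float64)
    attention = np.asarray(attention, dtype=np.float64)

    if source.shape != translated.shape:
        raise ValueError("source and translated must have the same shape")
    if attention.shape != source.shape[:2]:
        raise ValueError("attention must have shape (H, W)")
    if attention.min() < 0.0 or attention.max() > 1.0:
        raise ValueError("attention values must lie in [0, 1]")

    # Add a trailing axis so the map broadcasts across the channel dimension.
    a = attention[..., np.newaxis]
    return a * translated + (1.0 - a) * source


# Toy 2x2 RGB example: source is all zeros, translated is all ones.
src = np.zeros((2, 2, 3))
out_g = np.ones((2, 2, 3))
att = np.array([[0.0, 1.0], [0.5, 0.25]])
blended = blend_with_attention(src, out_g, att)
```

With this toy input, each blended pixel simply equals its attention value in every channel, which makes it easy to see that the map controls how much of the translated image replaces the source at each location.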

Pages: 2361-2371

Page count: 11