Text-based Sequential Image Generation

Cited by: 0

Authors:

Efimova, Valeria [1,2]

Filchenkov, Andrey [1]

Affiliations:

[1] ITMO Univ, St Petersburg, Russia

[2] Statanly Technol, St Petersburg, Russia

Source:

FOURTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2021) | 2022 / Vol. 12084

Keywords:

text-to-image generation; transformer; layout generation

DOI:

10.1117/12.2622734

Chinese Library Classification number:

O43 [Optics]

Discipline classification codes:

070207; 0803

Abstract:

Despite the recent impressive results of generative adversarial networks on text-to-image generation, generating complex scenes with multiple objects against a complicated background remains challenging; moreover, end-to-end text-to-image generation still suffers from poor image quality. In this work, we propose a sequential text-to-image generation algorithm that synthesizes high-quality images (more than 1024x1024 pixels). The proposed approach consists of location inference, key object extraction, image search, layout generation, and image harmonization stages. We compare the suggested approach with the state-of-the-art image generation model DALL-E with text-to-image mapping. Our approach demonstrates the effectiveness and visual plausibility of the generated images based on golden section layouts.

Pages: 8
