Text-based Sequential Image Generation

Authors
Efimova, Valeria [1, 2]
Filchenkov, Andrey [1]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
[2] Statanly Technol, St Petersburg, Russia
Keywords
text-to-image generation; transformer; layout generation
DOI
10.1117/12.2622734
Chinese Library Classification
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Despite recent impressive results of generative adversarial networks on text-to-image generation, generating complex scenes with multiple objects against a complicated background remains challenging; moreover, end-to-end text-to-image generation still suffers from poor image quality. In this work, we propose a sequential text-to-image generation algorithm that synthesizes high-quality images (larger than 1024x1024 pixels). The proposed approach consists of five stages: location inference, key object extraction, image search, layout generation, and image harmonization. We compare the approach with DALL-E, a state-of-the-art image generation model with text-to-image mapping. The results demonstrate the effectiveness of our approach and the visual plausibility of images generated from golden-section layouts.
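
The abstract does not say how the golden section layouts are computed. As a minimal illustration only, and assuming the common compositional reading in which each side of the canvas is divided in the ratio 1 : φ, the four intersections of the dividing lines can serve as candidate anchor points for key objects. The names `golden_section_points` and `place_centered` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

PHI = (1 + 5 ** 0.5) / 2  # golden ratio, approximately 1.618


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def golden_section_points(width: int, height: int) -> list[Point]:
    """Return the four intersections of the golden-section lines of a canvas.

    Assumption: each side is split at 1 - 1/PHI and 1/PHI of its length
    (about 38.2% and 61.8%). The paper may use a different construction.
    """
    xs = (width * (1 - 1 / PHI), width / PHI)
    ys = (height * (1 - 1 / PHI), height / PHI)
    # Row-major order: top-left, top-right, bottom-left, bottom-right.
    return [Point(x, y) for y in ys for x in xs]


def place_centered(anchor: Point, obj_w: int, obj_h: int) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) box of an object centered on anchor."""
    left = round(anchor.x - obj_w / 2)
    top = round(anchor.y - obj_h / 2)
    return left, top, left + obj_w, top + obj_h


if __name__ == "__main__":
    for point in golden_section_points(1024, 1024):
        print(place_centered(point, 200, 200))
```

In such a sketch, a layout stage would choose one anchor per key object; how the paper resolves overlaps or scales objects is not stated in the abstract and is left out here.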
Pages: 8