Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation

Authors
Quang Nguyen [1, 2]
Truong Vu [1 ]
Anh Tran [1 ]
Khoi Nguyen [1 ]
Affiliations
[1] VinAI Res, Ho Chi Minh City, Vietnam
[2] Ho Chi Minh City Univ Technol, VNU HCM, Ho Chi Minh City, Vietnam
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Preparing training data for deep vision models is a labor-intensive task. To address this, generative models have emerged as an effective solution for generating synthetic data. While current generative models produce image-level category labels, we propose a novel method for generating pixel-level semantic segmentation labels using the text-to-image generative model Stable Diffusion (SD). By utilizing the text prompts, cross-attention, and self-attention of SD, we introduce three new techniques: class-prompt appending, class-prompt cross-attention, and self-attention exponentiation. These techniques enable us to generate segmentation maps corresponding to synthetic images. These maps serve as pseudo-labels for training semantic segmenters, eliminating the need for labor-intensive pixel-wise annotation. To account for the imperfections in our pseudo-labels, we incorporate uncertainty regions into the segmentation, allowing us to disregard loss from those regions. We conduct evaluations on two datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms concurrent work. Our benchmarks and code will be released at https://github.com/VinAIResearch/Dataset-Diffusion.
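The abstract names its techniques without giving formulas, so the following is only a minimal sketch of how they could fit together, not the authors' released implementation. It assumes that class-prompt appending joins class names onto a caption, that self-attention exponentiation multiplies a pixels-by-classes cross-attention matrix by a power of the pixels-by-pixels self-attention matrix, and that uncertainty regions are pixels whose best score falls between two thresholds. All function names, the separator, the `IGNORE` value, and the thresholds are hypothetical.

```python
# Illustrative sketch only: every name, formula, and threshold below is an
# assumption made for this example, not taken from the paper's code.

IGNORE = 255  # hypothetical label marking uncertain pixels


def append_class_prompt(caption, class_names):
    """Append class names to a caption so each class has its own prompt tokens."""
    return f"{caption}; {' '.join(class_names)}"


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def refine_cross_attention(self_attn, cross_attn, power):
    """Left-multiply cross-attention (pixels x classes) by self-attention
    (pixels x pixels) `power` times, i.e. compute self_attn**power @ cross_attn."""
    refined = cross_attn
    for _ in range(power):
        refined = matmul(self_attn, refined)
    return refined


def to_pseudo_labels(scores, low, high):
    """Map per-pixel class scores to labels: 0 is background, k + 1 is class k,
    and IGNORE marks pixels whose best score lies strictly between low and high."""
    labels = []
    for pixel_scores in scores:
        best = max(range(len(pixel_scores)), key=lambda k: pixel_scores[k])
        if pixel_scores[best] >= high:
            labels.append(best + 1)
        elif pixel_scores[best] <= low:
            labels.append(0)
        else:
            labels.append(IGNORE)
    return labels


def masked_mean_loss(per_pixel_loss, labels):
    """Average the per-pixel loss while skipping pixels labeled IGNORE."""
    kept = [loss for loss, y in zip(per_pixel_loss, labels) if y != IGNORE]
    return sum(kept) / len(kept) if kept else 0.0


# Toy usage with 3 pixels and 2 classes; the numbers are made up.
prompt = append_class_prompt("a dog on a sofa", ["dog", "sofa"])
self_attn = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
cross_attn = [[0.9, 0.1], [0.5, 0.1], [0.0, 0.2]]
refined = refine_cross_attention(self_attn, cross_attn, power=1)
labels = to_pseudo_labels(refined, low=0.1, high=0.6)
loss = masked_mean_loss([1.0, 3.0, 100.0], labels)
```

In this toy case the third pixel's best score falls between the two thresholds, so it receives the `IGNORE` label and its large loss value does not contribute to the average. In a real training loop the same effect is usually obtained by passing an ignore index to the segmentation loss.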
Pages: 21