A Markov Chain approach for video-based virtual try-on with denoising diffusion generative adversarial network

Cited by: 1
Authors
Hou, Jue [1, 2]
Lu, Yinwen [1, 2]
Wang, Mingjie [3]
Ouyang, Wenbing [4]
Yang, Yang [1, 2]
Zou, Fengyuan [1, 2]
Gu, Bingfei [1, 2]
Liu, Zheng [2, 5]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Fash Design & Engn, CN-310018 Hangzhou, Zhejiang, Peoples R China
[2] Minist Culture & Tourism, Key Lab Silk Culture Heritage & Prod Design Digita, CN-310018 Hangzhou, Zhejiang, Peoples R China
[3] Zhejiang Sci Tech Univ, Sch Sci, Dept Math, CN-310018 Hangzhou, Zhejiang, Peoples R China
[4] Amazon Inc, 410 Terry Ave N, Seattle, WA 98109 USA
[5] Zhejiang Sci Tech Univ, Sch Int Educ, CN-310018 Hangzhou, Zhejiang, Peoples R China
Keywords
Markov Chain; Diffusion model; Video synthesis; Virtual try-on
DOI
10.1016/j.knosys.2024.112233
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video-based virtual try-ons have attracted unprecedented attention owing to the development of e-commerce. However, this problem is very challenging because of the arbitrary poses of persons and the demand for temporal consistency across frames, particularly when attempting to synthesize high-quality virtual try-on videos from single images. Specifically, there are two key challenges. 1) Existing video-based virtual try-on methods are based on generative adversarial networks (GANs), which are limited by unstable training and a lack of realism in generated details. 2) Stronger constraints on the generated frames must be built explicitly to increase the coherence of generated videos. To address these challenges, this study proposed a novel framework, the Extended Markov Chain Based Denoising Diffusion Generative Adversarial Network (EMC-DDGAN), derived from the denoising diffusion GAN, a diffusion model with efficient sampling. Moreover, we proposed an extended Markov chain that used a diffusion model to synthesize frames via sequential generation. With a carefully designed network and learning objectives, the proposed approach achieved outstanding performance on public datasets. Rigorous experiments demonstrated that EMC-DDGAN could synthesize higher-quality videos than other state-of-the-art methods and validated the effectiveness of the proposed approach.
Pages: 16