Dance2Music-Diffusion: leveraging latent diffusion models for music generation from dance videos

Cited by: 0
Authors: Zhang, Chaoyang [1]; Hua, Yan [1]
Affiliation: [1] Commun Univ China, Sch Informat & Commun Engn, 1 Dingfuzhuang East St, Beijing 100024, Peoples R China
Fund: National Natural Science Foundation of China
Keywords: Diffusion; Cross-modality; Dance to music; Transformer
DOI: 10.1186/s13636-024-00370-6
CLC number: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract:
With the rapid development of social networks, short videos, especially dance videos, have become a popular form of content. In this context, research on automatically generating music for dance videos has significant practical value. However, existing studies face challenges such as limited richness of musical timbre and poor synchronization with dance movements. In this paper, we propose Dance2Music-Diffusion, a novel framework for generating music from dance videos using latent diffusion models. Our approach comprises a motion encoder module for extracting motion features and a music diffusion generation module for generating latent music representations. By integrating dance-genre supervision with latent diffusion techniques, our framework outperforms existing methods in generating complex, rich dance music. We conducted objective and subjective evaluations of the results produced by various existing models on the AIST++ dataset. Our framework shows outstanding performance in beat recall, consistency with ground-truth beats, and coordination with dance movements. This work represents the state of the art in automatic music generation from dance videos, is easy to train, and has implications for enhancing entertainment experiences and inspiring innovative dance productions. Sample videos of our generated music and dance can be viewed at https://youtu.be/eCvLdLdkX-Y. The code is available at https://github.com/hellto/dance2music-diffusion.
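The abstract describes a two-stage pipeline: a motion encoder extracts features from the dance video, and a diffusion model generates a music latent conditioned on those features. The following is a minimal illustrative sketch of motion-conditioned DDPM reverse sampling in general, not the authors' implementation; the noise schedule, `toy_denoiser`, and all dimensions are hypothetical stand-ins (a real system would use a trained transformer denoiser and decode the latent to audio).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule over T diffusion steps (illustrative values).
T = 50
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def toy_denoiser(z_t, t, motion_feat):
    """Stand-in for the learned noise predictor, conditioned on
    motion features; a real model would be a conditional transformer."""
    return 0.1 * z_t + 0.01 * motion_feat

def sample_music_latent(motion_feat, latent_dim=8):
    """Draw a music latent by iterating the standard DDPM reverse step,
    conditioned on dance-motion features."""
    z = rng.standard_normal(latent_dim)  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(z, t, motion_feat)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (z - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(latent_dim) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise  # no noise at the final step
    return z

motion = rng.standard_normal(8)  # placeholder for motion-encoder output
latent = sample_music_latent(motion)
print(latent.shape)
```

The sketch only shows the sampling loop; training the denoiser and decoding the latent into a waveform are outside its scope.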
Pages: 12
Related papers (50 total)
  • [1] Motion to Dance Music Generation using Latent Diffusion Model
    Tan, Vanessa
    Nam, JungHyun
    Nam, Juhan
    Noh, Junyong
    PROCEEDINGS SIGGRAPH ASIA 2023 TECHNICAL COMMUNICATIONS, SA TECHNICAL COMMUNICATIONS 2023, 2023
  • [2] Quantized GAN for Complex Music Generation from Dance Videos
    Zhu, Ye
    Olszewski, Kyle
    Wu, Yu
    Achlioptas, Panos
    Chai, Menglei
    Yan, Yan
    Tulyakov, Sergey
    COMPUTER VISION, ECCV 2022, PT XXXVII, 2022, 13697 : 182 - 199
  • [3] Diffusion of Ecstasy in the Electronic Dance Music Scene
    Palamar, Joseph J.
    SUBSTANCE USE & MISUSE, 2020, 55 (13) : 2243 - 2250
  • [4] Music2Dance: DanceNet for Music-Driven Dance Generation
    Zhuang, Wenlin
    Wang, Congyi
    Chai, Jinxiang
    Wang, Yangang
    Shao, Ming
    Xia, Siyu
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (02)
  • [5] Update (Dance videos and keyboard music)
    Hilley, M
    CLAVIER, 1997, 36 (09): 48
  • [6] Discrete diffusion model with contrastive learning for music to natural and long dance generation
    Wang, Huaxin
    Jiang, Yujian
    Zhou, Xiangzhong
    Jiang, Wei
    npj Heritage Science, 13 (1)
  • [7] THE REMIX GENERATION + DANCE MUSIC
    WATNEY, S
    ARTFORUM, 1994, 33 (02): 15 - 16
  • [8] EDGE: Editable Dance Generation From Music
    Tseng, Jonathan
    Castellon, Rodrigo
    Liu, C. Karen
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 448 - 458
  • [9] DanceU: motion-and-music-based automatic effect generation for dance videos
    Pan, Yanjie
    Du, Yaru
    Wang, Shandong
    Ye, Yun
    Jiang, Yong
    Zhou, Zhen
    Xu, Li
    Lu, Ming
    Lin, Yunbiao
    Lu, Jiehui
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2093 - 2098
  • [10] Music-Driven Dance Generation
    Qi, Yu
    Liu, Yazhou
    Sun, Quansen
    IEEE ACCESS, 2019, 7 : 166540 - 166550