ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

Cited by: 13
Authors
Zhang, Mingyuan [1]
Guo, Xinying [1]
Pan, Liang [1]
Cai, Zhongang [1, 2]
Hong, Fangzhou [1]
Li, Huirong [1]
Yang, Lei [2]
Liu, Ziwei [1]
Affiliations
[1] Nanyang Technological University, S-Lab, Singapore
[2] SenseTime, Shanghai, People's Republic of China
DOI
10.1109/ICCV51070.2023.00040
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D human motion generation is crucial for the creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, performance on more diverse motions remains unsatisfactory. In this work, we propose ReMoDiffuse, a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process. ReMoDiffuse enhances the generalizability and diversity of text-driven motion generation with three key designs: 1) Hybrid Retrieval finds appropriate references from the database in terms of both semantic and kinematic similarities. 2) Semantic-Modulated Transformer selectively absorbs retrieval knowledge, adapting to the difference between retrieved samples and the target motion sequence. 3) Condition Mixture better utilizes the retrieval database during inference, overcoming the scale sensitivity in classifier-free guidance. Extensive experiments demonstrate that ReMoDiffuse outperforms state-of-the-art methods by balancing text-motion consistency and motion quality, especially for more diverse motion generation. Project page: https://mingyuan-zhang.github.io/projects/ReMoDiffuse.html
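To make the Hybrid Retrieval idea concrete, the following is a minimal sketch of ranking database entries by a combination of semantic and kinematic similarity. The abstract does not specify the scoring rule, so everything here is an assumption for illustration: cosine similarity between text embeddings stands in for the semantic term, the relative difference in sequence length stands in for the kinematic term, and the names `hybrid_score`, `retrieve`, and the weight `lam` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_score(query_emb, target_len, entry_emb, entry_len, lam=1.0):
    """Assumed hybrid score: semantic similarity damped by a length mismatch penalty."""
    semantic = cosine(query_emb, entry_emb)
    # Assumed kinematic cue: relative gap between motion lengths, in [0, 1).
    rel_len_gap = abs(entry_len - target_len) / max(entry_len, target_len)
    return semantic * math.exp(-lam * rel_len_gap)


def retrieve(query_emb, target_len, database, k=2, lam=1.0):
    """Return the names of the k highest-scoring entries, best first."""
    ranked = sorted(
        database,
        key=lambda e: hybrid_score(query_emb, target_len, e["emb"], e["length"], lam),
        reverse=True,
    )
    return [e["name"] for e in ranked[:k]]


# Toy database: embeddings and lengths are made up for illustration.
database = [
    {"name": "walk_short", "emb": [1.0, 0.0], "length": 60},
    {"name": "walk_long", "emb": [1.0, 0.0], "length": 240},
    {"name": "jump", "emb": [0.0, 1.0], "length": 60},
]

top = retrieve([1.0, 0.0], 60, database, k=2)
```

In this toy setup, two entries share the same text embedding, so the length term alone decides their order; an entry that matches the target length but not the text ranks last.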
Pages: 364-373
Page count: 10