ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

Cited by: 13
Authors
Zhang, Mingyuan [1]
Guo, Xinying [1]
Pan, Liang [1]
Cai, Zhongang [1, 2]
Hong, Fangzhou [1]
Li, Huirong [1]
Yang, Lei [2]
Liu, Ziwei [1]
Affiliations
[1] Nanyang Technological University, S-Lab, Singapore, Singapore
[2] SenseTime, Shanghai, People's Republic of China
DOI
10.1109/ICCV51070.2023.00040
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D human motion generation is crucial for the creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, the performance on more diverse motions remains unsatisfactory. In this work, we propose ReMoDiffuse, a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process. ReMoDiffuse enhances the generalizability and diversity of text-driven motion generation with three key designs: 1) Hybrid Retrieval finds appropriate references from the database in terms of both semantic and kinematic similarities. 2) Semantic-Modulated Transformer selectively absorbs retrieval knowledge, adapting to the difference between retrieved samples and the target motion sequence. 3) Condition Mixture better utilizes the retrieval database during inference, overcoming the scale sensitivity in classifier-free guidance. Extensive experiments demonstrate that ReMoDiffuse outperforms state-of-the-art methods by balancing both text-motion consistency and motion quality, especially for more diverse motion generation. Project page: https://mingyuan-zhang.github.io/projects/ReMoDiffuse.html
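The abstract says only that Hybrid Retrieval ranks database entries by both semantic and kinematic similarity; it does not give a scoring rule. The following is a minimal sketch under stated assumptions: cosine similarity between text embeddings stands in for the semantic term, and an exponential penalty on the relative difference in sequence length stands in for the kinematic term. The function names, the `lam` weight, and the toy database are all hypothetical and are not taken from the paper's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_score(query_vec, query_len, entry_vec, entry_len, lam=1.0):
    """Assumed scoring rule: semantic similarity damped by a length mismatch penalty.

    Lengths are frame counts and are assumed to be positive.
    """
    semantic = cosine(query_vec, entry_vec)
    # Relative length gap in [0, 1); used here as a crude kinematic proxy.
    gap = abs(query_len - entry_len) / max(query_len, entry_len)
    return semantic * math.exp(-lam * gap)


def retrieve(query_vec, query_len, database, k=2, lam=1.0):
    """Return the names of the k highest-scoring database entries."""
    ranked = sorted(
        database,
        key=lambda e: hybrid_score(query_vec, query_len, e["vec"], e["length"], lam),
        reverse=True,
    )
    return [e["name"] for e in ranked[:k]]


# Toy database: 2-D "text embeddings" and motion lengths in frames (illustrative values).
database = [
    {"name": "walk_short", "vec": [1.0, 0.0], "length": 60},
    {"name": "walk_long", "vec": [1.0, 0.0], "length": 200},
    {"name": "jump", "vec": [0.0, 1.0], "length": 60},
]

print(retrieve([1.0, 0.1], 60, database, k=2))
```

In this toy setup the two "walk" entries share the same embedding, so the length penalty alone decides their order: the entry whose length matches the query ranks first. Raising `lam` makes length agreement matter more relative to text similarity.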
Pages: 364-373
Page count: 10
Related papers
50 in total
  • [31] GOODTRIEVER: Adaptive Toxicity Mitigation with Retrieval-augmented Models
    Pozzobon, Luiza
    Ermis, Beyza
    Lewis, Patrick
    Hooker, Sara
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 5108-5125
  • [32] The Journey to A Knowledgeable Assistant with Retrieval-Augmented Generation (RAG)
    Dong, Xin Luna
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024: 4
  • [33] Diversify Question Generation with Retrieval-Augmented Style Transfer
    Gou, Qi
    Xia, Zehua
    Yu, Bowen
    Yu, Haiyang
    Huang, Fei
    Li, Yongbin
    Nguyen, Cam-Tu
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023: 1677-1690
  • [34] Revisiting and Improving Retrieval-Augmented Deep Assertion Generation
    Sun, Weifeng
    Li, Hongyan
    Yan, Meng
    Lei, Yan
    Zhang, Hongyu
    2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023: 1123-1135
  • [35] Retrieval-Augmented Few-shot Text Classification
    Yu, Guoxin
    Liu, Lemao
    Jiang, Haiyun
    Shi, Shuming
    Ao, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 6721-6735
  • [36] Web Application for Retrieval-Augmented Generation: Implementation and Testing
    Radeva, Irina
    Popchev, Ivan
    Doukovska, Lyubka
    Dimitrova, Miroslava
    ELECTRONICS, 2024, 13 (07)
  • [37] Performance Evaluation of Vector Embeddings with Retrieval-Augmented Generation
    Kukreja, Sanjay
    Kumar, Tarun
    Bharate, Vishal
    Purohit, Amit
    Dasgupta, Abhijit
    Guha, Debashis
    2024 9TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS, ICCCS 2024, 2024: 333-340
  • [38] ReadsRE: Retrieval-Augmented Distantly Supervised Relation Extraction
    Zhang, Yue
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 2257-2262
  • [39] RA-CFGPT: Chinese financial assistant with retrieval-augmented large language model
    Li, Jiangtong
    Lei, Yang
    Bian, Yuxuan
    Cheng, Dawei
    Ding, Zhijun
    Jiang, Changjun
    FRONTIERS OF COMPUTER SCIENCE, 2024, 18 (05)
  • [40] Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model
    Muludi, Kurnia
    Fitria, Kaira Milani
    Triloka, Joko
    Sutedi
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (03): 776-785