MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production

Authors
Ma, Jian [1, 3]
Wang, Wenguan [2 ]
Yang, Yi [2 ]
Zheng, Feng [1 ]
Affiliations
[1] Southern Univ Sci & Technol, Shenzhen, Peoples R China
[2] Zhejiang Univ, ReLER, CCAI, Hangzhou, Peoples R China
[3] Univ Technol Sydney, ReLER, Ultimo, NSW, Australia
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sign language understanding has made significant strides; however, there is still no viable solution for generating sign sequences directly from entire spoken content, e.g., text or speech. In this paper, we propose a unified framework for continuous sign language production, easing communication between sign and non-sign language users. In particular, a sequence diffusion model, utilizing embeddings extracted from text or speech, is crafted to generate sign predictions step by step. Moreover, by creating a joint embedding space for text, audio, and sign, we bind these modalities and leverage the semantic consistency among them to provide informative feedback for the model training. This embedding-consistency learning strategy minimizes the reliance on sign triplets and ensures continuous model refinement, even with a missing audio modality. Experiments on How2Sign and PHOENIX14T datasets demonstrate that our model achieves competitive performance in sign language production.
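The embedding-consistency idea in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `consistency_loss`, the use of a plain cosine distance, and the simple averaging over modality pairs are all assumptions made for illustration. The sketch only shows how a loss over a joint text, audio, and sign embedding space can still be computed when the audio embedding is missing.

```python
import math
from typing import Optional, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def consistency_loss(
    sign: Vector, text: Vector, audio: Optional[Vector] = None
) -> float:
    """Hypothetical consistency loss: mean cosine distance over available pairs.

    Assumption: all embeddings already live in one joint space. When the
    audio embedding is missing, only the sign-text pair contributes, so
    the loss stays defined for text-sign pairs without audio.
    """
    pairs = [(sign, text)]
    if audio is not None:
        pairs.append((sign, audio))
        pairs.append((text, audio))
    return sum(1.0 - cosine_similarity(a, b) for a, b in pairs) / len(pairs)
```

Calling `consistency_loss(sign, text)` without the third argument corresponds to the missing-audio case; passing all three embeddings averages the distance over the three modality pairs.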
Pages: 7241-7254
Page count: 14