Multimedia event extraction based on multimodal low-dimensional feature representation space

Cited by: 0
Authors
Cui, Yiming [1]
Sun, Bin [1]
Jiang, Tao [1]
Cui, Hongrui [1]
Affiliations
[1] Northwest Minzu Univ, Key Lab Language & Cultural Comp, Minist Educ, Natl Languages Informat Technol, Lanzhou 730000, Gansu, Peoples R China
Funding
Fundamental Research Funds for the Central Universities
Keywords
Multimedia event extraction; Multimodal representation learning; Contrast learning; Momentum distillation; Image description generation;
DOI
10.1007/s11760-025-03999-8
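Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract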
Research on multimedia event extraction has emerged in recent years. However, because large-scale annotated datasets are lacking, most existing studies rely on weakly supervised methods that draw on different datasets for training and testing, which inevitably exposes event extraction to differences in dataset distribution and to noise. Meanwhile, although modal fusion can effectively model the correlation and complementarity between modalities, it may also introduce additional noise that degrades the extraction results. To address these problems, we propose a multimedia event extraction method based on a multimodal low-dimensional feature representation space (MLDFR), which focuses on handling noise interference during multimodal fusion. On the one hand, MLDFR combines contrastive learning and momentum distillation to construct a low-dimensional feature representation space, which strengthens the model's ability to match text and images in that space and mitigates the interference of dataset noise with multimodal information fusion. On the other hand, during visual event extraction, MLDFR not only fuses the corresponding textual events as additional features, but also uses a generative model to produce descriptions of the image and integrates them into the extraction process as further complementary features, so as to better model inter-modal correlations. Experimental results on the benchmark dataset show that the proposed MLDFR method can significantly improve the performance of multimedia event extraction.
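The abstract does not give the loss MLDFR uses, so the following is only a minimal sketch of how contrastive image-text matching is often combined with momentum distillation. It assumes an InfoNCE-style objective whose one-hot targets are blended with soft targets from a slowly updated momentum encoder. Every name here (`contrastive_distillation_loss`, `ema_update`, `temperature`, `alpha`, `decay`) and every default value is hypothetical and not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def similarity(a_batch, b_batch, temperature):
    """Temperature-scaled cosine similarity between every pair (a_i, b_j)."""
    a = [l2_normalize(v) for v in a_batch]
    b = [l2_normalize(v) for v in b_batch]
    return [[sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a]


def one_direction_loss(queries, keys, queries_m, keys_m, temperature, alpha):
    """Cross-entropy from queries to keys with momentum-distilled targets."""
    n = len(queries)
    logits = similarity(queries, keys, temperature)
    logits_m = similarity(queries_m, keys_m, temperature)
    loss = 0.0
    for i in range(n):
        probs = softmax(logits[i])
        soft = softmax(logits_m[i])  # pseudo-targets from the momentum encoder
        # Blend the one-hot label (pair i matches i) with the soft targets.
        target = [(1 - alpha) * (1.0 if j == i else 0.0) + alpha * soft[j]
                  for j in range(n)]
        loss -= sum(t * math.log(p + 1e-12) for t, p in zip(target, probs))
    return loss / n


def contrastive_distillation_loss(img, txt, img_m, txt_m,
                                  temperature=0.07, alpha=0.4):
    """Average of the image-to-text and text-to-image losses."""
    i2t = one_direction_loss(img, txt, img_m, txt_m, temperature, alpha)
    t2i = one_direction_loss(txt, img, txt_m, img_m, temperature, alpha)
    return 0.5 * (i2t + t2i)


def ema_update(momentum_params, online_params, decay=0.995):
    """Move the momentum encoder's parameters slowly toward the online ones."""
    return [decay * m + (1 - decay) * p
            for m, p in zip(momentum_params, online_params)]


# Toy batch: two image embeddings and their matching text embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_aligned = [[1.0, 0.0], [0.0, 1.0]]
texts_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = contrastive_distillation_loss(
    images, texts_aligned, images, texts_aligned)
loss_swapped = contrastive_distillation_loss(
    images, texts_swapped, images, texts_swapped)
```

In this toy batch the aligned pairs yield a lower loss than the swapped ones, which is the behaviour a shared representation space is meant to reward. The soft targets keep the objective from fully trusting one-hot labels that may be wrong in a noisy, weakly supervised dataset.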
Pages: 15