Multimedia event extraction based on multimodal low-dimensional feature representation space

Authors
Cui, Yiming [1]
Sun, Bin [1]
Jiang, Tao [1]
Cui, Hongrui [1]
Affiliations
[1] Northwest Minzu Univ, Key Lab Language & Cultural Comp, Minist Educ, Natl Languages Informat Technol, Lanzhou 730000, Gansu, Peoples R China
Funding
Fundamental Research Funds for the Central Universities
Keywords
Multimedia event extraction; Multimodal representation learning; Contrast learning; Momentum distillation; Image description generation;
DOI
10.1007/s11760-025-03999-8
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Research on multimedia event extraction has emerged in recent years. Because large-scale annotated datasets are lacking, most existing studies rely on weakly supervised methods that use different datasets in the training and testing phases, which exposes event extraction to dataset distribution differences and noise. Moreover, although modal fusion can effectively model the correlation and complementarity between modalities, it may also introduce additional noise that degrades the extraction results. To address these problems, we propose a multimedia event extraction method based on a multimodal low-dimensional feature representation space (MLDFR), which focuses on handling noise interference during multimodal fusion. On the one hand, MLDFR combines contrastive learning and momentum distillation to construct a low-dimensional feature representation space, which enhances the model's ability to match text and images in that space and mitigates the interference of dataset noise on multimodal information fusion. On the other hand, during visual event extraction, MLDFR not only fuses the corresponding textual events as additional features but also generates image descriptions with a generative model and integrates them into the extraction process as further complementary features, to better model inter-modal correlations. Experimental results on the benchmark dataset show that the proposed MLDFR method can significantly improve the performance of multimedia event extraction.
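The abstract names contrastive learning and momentum distillation but does not give their formulation. The following is a minimal sketch of one common way the two are combined, not the authors' implementation: an image-to-text InfoNCE loss whose one-hot targets are softened with a teacher distribution, where the teacher is an exponential moving average of the student. All function names, the mixing weight `alpha`, the `momentum` value, and the `temperature` are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def log_softmax(scores, temperature):
    """Numerically stable log-softmax over temperature-scaled scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def ema_update(teacher, student, momentum=0.995):
    """Momentum update (assumed form): teacher drifts slowly toward student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def contrastive_loss(image_embs, text_embs, teacher_probs=None,
                     alpha=0.4, temperature=0.07):
    """Image-to-text InfoNCE loss over a batch of paired embeddings.

    If teacher_probs is given (one distribution over texts per image),
    the one-hot targets are mixed with it using the assumed weight alpha.
    """
    n = len(image_embs)
    total = 0.0
    for i in range(n):
        sims = [dot(image_embs[i], t) for t in text_embs]
        log_probs = log_softmax(sims, temperature)
        # Hard target: the i-th text is the match for the i-th image.
        targets = [1.0 if j == i else 0.0 for j in range(n)]
        if teacher_probs is not None:
            targets = [(1.0 - alpha) * y + alpha * q
                       for y, q in zip(targets, teacher_probs[i])]
        total += -sum(y * lp for y, lp in zip(targets, log_probs))
    return total / n


# Toy batch: two images and two texts in a 2-dimensional shared space.
images = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
texts_aligned = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
texts_swapped = [texts_aligned[1], texts_aligned[0]]

loss_aligned = contrastive_loss(images, texts_aligned)
loss_swapped = contrastive_loss(images, texts_swapped)
loss_distilled = contrastive_loss(
    images, texts_aligned, teacher_probs=[[0.9, 0.1], [0.1, 0.9]])
```

In this sketch, correctly paired embeddings yield a lower loss than swapped ones, and the soft targets keep the loss from rewarding full confidence in a pairing that the teacher considers uncertain, which is one way noisy image-text pairs can be down-weighted.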
Pages: 15