Multi-Level Cross-Modal Interactive-Network-Based Semi-Supervised Multi-Modal Ship Classification

Cited by: 0
Authors
Song, Xin [1 ]
Chen, Zhikui [1 ]
Zhong, Fangming [1 ]
Gao, Jing [1 ]
Zhang, Jianning [1 ]
Li, Peng [1 ]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Dalian 116621, Peoples R China
Keywords
ship classification; deep multi-modal learning; semi-supervised learning; IMAGE FUSION;
DOI
10.3390/s24227298
CLC Number
O65 [Analytical Chemistry];
Discipline Codes
070302; 081704
Abstract
Ship image classification identifies the type of ship in an input image and plays a significant role in the marine field. To enhance classification performance, much research focuses on multi-modal ship classification, which combines the advantages of visible and infrared images to capture complementary information. However, current methods simply concatenate the features of different modalities to learn complementary information, neglecting the multi-level correlation between modalities. Moreover, existing methods require a large amount of labeled ship images to train the model; how to capture the multi-level cross-modal correlation between unlabeled and labeled data remains a challenge. In this paper, a novel semi-supervised multi-modal ship classification approach is proposed to address these issues. It consists of two components: a multi-level cross-modal interactive network and a semi-supervised contrastive learning strategy. To learn comprehensive complementary information for classification, the multi-level cross-modal interactive network is designed to build local-level and global-level cross-modal feature correlations. The semi-supervised contrastive learning strategy is then employed to drive the optimization of the network with an intra-class consistency constraint, based on supervision signals from unlabeled samples and prior label information. Extensive experiments on public datasets demonstrate that our approach achieves state-of-the-art semi-supervised classification performance.
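The intra-class consistency idea behind the abstract's contrastive strategy can be illustrated with a generic supervised-contrastive-style loss: embeddings that share a label (a ground-truth label for labeled samples, or a pseudo-label standing in as the supervision signal for unlabeled ones) are pulled together, while all others are pushed apart. The NumPy sketch below is a minimal illustration under these assumptions, not the paper's exact formulation; the function name and temperature parameter are hypothetical.

```python
import numpy as np

def intra_class_contrastive_loss(z, labels, tau=0.1):
    """Generic supervised-contrastive-style loss (sketch).

    z      : (n, d) array of fused multi-modal embeddings
    labels : (n,) array of class labels (true or pseudo-labels)
    tau    : temperature scaling the cosine similarities
    """
    # L2-normalize so the dot product is cosine similarity
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    n = len(labels)
    off_diag = ~np.eye(n, dtype=bool)

    # Positives: other samples carrying the same (pseudo-)label
    mask_pos = (labels[:, None] == labels[None, :]) & off_diag

    # Log-softmax over all other samples (numerically stabilized)
    logits = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(logits) * off_diag
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))

    # Average -log p over positives, for anchors that have positives
    pos_counts = mask_pos.sum(axis=1)
    valid = pos_counts > 0
    loss = -(log_prob * mask_pos).sum(axis=1)[valid] / pos_counts[valid]
    return loss.mean()
```

With this loss, tight same-class clusters yield a smaller value than embeddings whose labels cross cluster boundaries, which is what drives the intra-class consistency constraint during optimization.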
Pages: 19
Related Papers
50 in total
  • [1] Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching
    Liang, Jingjun
    Li, Ruichen
    Jin, Qin
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2852 - 2861
  • [2] Multi-Modal Curriculum Learning for Semi-Supervised Image Classification
    Gong, Chen
    Tao, Dacheng
    Maybank, Stephen J.
    Liu, Wei
    Kang, Guoliang
    Yang, Jie
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (07) : 3249 - 3260
  • [3] Semi-Supervised Multi-Modal Clustering and Classification with Incomplete Modalities
    Yang, Yang
    Zhan, De-Chuan
    Wu, Yi-Feng
    Liu, Zhi-Bin
    Xiong, Hui
    Jiang, Yuan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (02) : 682 - 695
  • [4] Multi-Modal Sentiment Classification With Independent and Interactive Knowledge via Semi-Supervised Learning
    Zhang, Dong
    Li, Shoushan
    Zhu, Qiaoming
    Zhou, Guodong
    IEEE ACCESS, 2020, 8 : 22945 - 22954
  • [5] Cross-Modal Retrieval Augmentation for Multi-Modal Classification
    Gur, Shir
    Neverova, Natalia
    Stauffer, Chris
    Lim, Ser-Nam
    Kiela, Douwe
    Reiter, Austin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 111 - 123
  • [6] Cross-modal attention network for retinal disease classification based on multi-modal images
    Liu, Zirong
    Hu, Yan
    Qiu, Zhongxi
    Niu, Yanyan
    Zhou, Dan
    Li, Xiaoling
    Shen, Junyong
    Jiang, Hongyang
    Li, Heng
    Liu, Jiang
    BIOMEDICAL OPTICS EXPRESS, 2024, 15 (06): : 3699 - 3714
  • [7] Comprehensive Semi-Supervised Multi-Modal Learning
    Yang, Yang
    Wang, Ke-Tao
    Zhan, De-Chuan
    Xiong, Hui
    Jiang, Yuan
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4092 - 4098
  • [8] MBIAN: Multi-level bilateral interactive attention network for multi-modal
    Sun, Kai
    Zhang, Jiangshe
    Wang, Jialin
    Xu, Shuang
    Zhang, Chunxia
    Hu, Junying
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 231
  • [9] Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs
    Liang, Bin
    Lou, Chenwei
    Li, Xiang
    Gui, Lin
    Yang, Min
    Xu, Ruifeng
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4707 - 4715
  • [10] A semi-supervised cross-modal memory bank for cross-modal retrieval
    Huang, Yingying
    Hu, Bingliang
    Zhang, Yipeng
    Gao, Chi
    Wang, Quan
    NEUROCOMPUTING, 2024, 579