Multi-label remote sensing classification with self-supervised gated multi-modal transformers

Cited by: 1
Authors
Liu, Na [1]
Yuan, Ye [1]
Wu, Guodong [2]
Zhang, Sai [2]
Leng, Jie [2]
Wan, Lihong [2]
Affiliations
[1] Univ Shanghai Sci & Technol, Inst Machine Intelligence, Shanghai, Peoples R China
[2] Origin Dynam Intelligent Robot Co Ltd, Zhengzhou, Peoples R China
Keywords
self-supervised learning; pre-training; vision transformer; multi-modal; gated units; BENCHMARK-ARCHIVE; LARGE-SCALE; BIGEARTHNET
DOI
10.3389/fncom.2024.1404623
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Introduction: Following the success of Transformers in machine learning, they are attracting growing interest in remote sensing (RS). However, RS research has been hampered by the lack of large labeled datasets and by inconsistent data modalities arising from the diversity of RS platforms. With the rise of self-supervised learning (SSL) algorithms in recent years, RS researchers have begun to apply the "pre-training and fine-tuning" paradigm to RS. Yet there is little research on multi-modal data fusion in RS: most studies use only one modality or simply concatenate the data from several modalities.
Method: To study a more efficient multi-modal data fusion scheme, we propose a multi-modal fusion mechanism based on gated unit control (MGSViT). We pre-train a ViT model on the BigEarthNet dataset by combining two commonly used SSL algorithms, and propose intra-modal and inter-modal gated fusion units for feature learning that combine multispectral (MS) and synthetic aperture radar (SAR) data. Our method can effectively combine data from different modalities to extract key feature information.
Results and discussion: In fine-tuning and comparison experiments, our method outperforms state-of-the-art algorithms on all downstream classification tasks, which verifies the validity of the proposed method.
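The abstract does not spell out how the gated fusion unit is built. As a rough illustration only, the Python sketch below shows one common form of gating: a sigmoid gate computed from the concatenated MS and SAR feature vectors weights an element-wise blend of the two. The function name `gated_fusion`, the weight shapes, and the blend formula are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(ms_feat, sar_feat, w_gate, b_gate):
    """Blend two equal-length feature vectors with a learned gate.

    Hypothetical form (not the paper's definition):
        g     = sigmoid([ms; sar] @ W + b)
        fused = g * ms + (1 - g) * sar
    w_gate has shape (2 * d, d); b_gate has length d.
    """
    joint = ms_feat + sar_feat  # list concatenation, length 2 * d
    fused = []
    for j in range(len(ms_feat)):
        z = sum(joint[i] * w_gate[i][j] for i in range(len(joint))) + b_gate[j]
        g = sigmoid(z)  # gate value for feature j
        fused.append(g * ms_feat[j] + (1.0 - g) * sar_feat[j])
    return fused


# Toy inputs standing in for per-modality encoder outputs.
random.seed(0)
dim = 8
ms_feat = [random.gauss(0.0, 1.0) for _ in range(dim)]
sar_feat = [random.gauss(0.0, 1.0) for _ in range(dim)]
w_gate = [[random.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(2 * dim)]
b_gate = [0.0] * dim

fused = gated_fusion(ms_feat, sar_feat, w_gate, b_gate)
print(len(fused))
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the two modality features, so neither modality can be discarded entirely. In a trained model the gate weights would be learned rather than sampled at random.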
Pages: 15