Appearance Label Balanced Triplet Loss for Multi-modal Aerial View Object Classification

Cited by: 2
Authors
Puttagunta, Raghunath Sai [1]
Li, Zhu [1]
Bhattacharyya, Shuvra [2]
York, George [3]
Affiliations
[1] Univ Missouri, Kansas City, MO 64110 USA
[2] Univ Maryland, College Pk, MD USA
[3] US Air Force Acad, Colorado Springs, CO USA
Keywords
LONG-TAILED RECOGNITION; NETWORK;
DOI
10.1109/CVPRW59228.2023.00060
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Automatic target recognition (ATR) from image data is an important computer vision task with widespread remote sensing applications, including surveillance, object tracking, urban planning, and agriculture. Despite continuous progress on this task, there is still significant room for improvement, particularly on aerial images. This work extracts rich information from multimodal synthetic aperture radar (SAR) and electro-optical (EO) aerial images to perform object classification. Compared to EO images, SAR images have the advantage that they can be captured at night and in any weather condition, and the disadvantage that they are noisy. Overcoming the noise inherent to SAR images is a challenging but worthwhile task because of the additional information SAR images provide to the model. This work proposes a training strategy that creates appearance labels and uses them to generate triplets for training the network with both triplet loss and cross-entropy loss. During the development phase of the 2023 Perception Beyond Visual Spectrum (PBVS) Multi-modal Aerial Image Object Classification (MAVOC) challenge, our ResNet-34 model achieved a top-1 accuracy of 64.29% for Track 1 and our ensemble learning model achieved a top-1 accuracy of 75.84% for Track 2. These accuracies are 542% and 247% higher than the respective baseline values. Overall, this work ranked 3rd in both Track 1 and Track 2.
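The combination of triplet loss and cross-entropy loss mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the Euclidean distance, the `margin` value, and the `weight` that balances the two terms are all assumptions made for illustration, and the abstract does not specify how appearance labels select the anchor, positive, and negative samples.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances between embeddings.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. `margin` is an assumed hyperparameter.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def combined_loss(anchor, positive, negative, logits, target,
                  margin=1.0, weight=1.0):
    """Cross-entropy plus a weighted triplet term (weighting is an assumption)."""
    return (cross_entropy(logits, target)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

In this sketch a triplet whose negative already lies far from the anchor contributes nothing to the triplet term, so only the classification term drives the update for that sample.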
Pages: 534-542
Page count: 9