Enhancing Few-Shot Image Classification With Cosine Transformer

Cited by: 7
Authors
Nguyen, Quang-Huy [1 ,2 ]
Nguyen, Cuong Q. [1 ,3 ]
Le, Dung D. D. [2 ]
Pham, Hieu H. [1 ,2 ,4 ]
Affiliations
[1] VinUniv, VinUni Illinois Smart Hlth Ctr, Hanoi 100000, Vietnam
[2] VinUniv, Coll Engn & Comp Sci, Hanoi 100000, Vietnam
[3] Vietnam Natl Univ Ho Chi Minh City, Univ Informat Technol, Comp Sci Dept, Ho Chi Minh City 700000, Vietnam
[4] Univ Illinois Urbana Champaign UIUC, Coordinated Sci Lab, Champaign, IL 61820 USA
Keywords
Few-shot learning; image classification; transformer; cross-attention; cosine similarity;
DOI
10.1109/ACCESS.2023.3298299
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This paper addresses the few-shot image classification problem, where classification is performed on unlabeled query samples given only a small number of labeled support samples. A major challenge in few-shot learning is the large variety of object visual appearances, which prevents the support samples from representing an object comprehensively. This can lead to a significant difference between support and query samples, thereby undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing the Few-shot Cosine Transformer (FS-CT), in which the relational map between supports and queries is obtained effectively for few-shot tasks. FS-CT consists of two parts: a learnable prototypical embedding network that obtains categorical representations from support samples, including hard cases, and a transformer encoder that effectively derives the relational map between the support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer significantly and thereby improves FS-CT accuracy by 5% to over 20% compared with the default scaled dot-product mechanism. Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS for 1-shot and 5-shot learning tasks across backbones and few-shot configurations. We also develop a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical applications. FS-CT with Cosine Attention is a lightweight, simple few-shot algorithm that can be applied to a wide range of applications, such as healthcare, medical imaging, and security surveillance. The official implementation of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer.
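Based only on the abstract's description, the following is a minimal sketch of how a cosine-similarity attention head between query and support embeddings might look. The class name, single-head design, tensor shapes, and the omission of any softmax or extra normalization over the cosine scores are assumptions for illustration, not details taken from the paper; the official implementation at the GitHub link above is authoritative.

import torch
import torch.nn as nn
import torch.nn.functional as F


class CosineAttention(nn.Module):
    """Attention whose weights are cosine similarities rather than scaled dot products (sketch)."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, queries: torch.Tensor, supports: torch.Tensor) -> torch.Tensor:
        # queries:  (batch, n_query,   embed_dim) query-sample embeddings
        # supports: (batch, n_support, embed_dim) prototypical support embeddings
        q = F.normalize(self.q_proj(queries), dim=-1)   # unit-length rows
        k = F.normalize(self.k_proj(supports), dim=-1)
        v = self.v_proj(supports)
        attn = q @ k.transpose(-2, -1)                  # cosine similarities in [-1, 1]
        return attn @ v                                 # relational map applied to the values


# Toy usage: a 5-way episode with 15 query images and 64-d embeddings (illustrative sizes).
attn = CosineAttention(embed_dim=64)
out = attn(torch.randn(1, 15, 64), torch.randn(1, 5, 64))
print(out.shape)  # torch.Size([1, 15, 64])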
Pages: 79659-79672
Page count: 14
Related Papers
50 items in total
  • [1] Adaptive feature recalibration transformer for enhancing few-shot image classification
    Song, Wei
    Huang, Yaobin
    VISUAL COMPUTER, 2025
  • [2] Enhancing Few-Shot Image Classification with Unlabelled Examples
    Bateni, Peyman
    Barber, Jarred
    van de Meent, Jan-Willem
    Wood, Frank
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1597 - 1606
  • [3] GRID-TRANSFORMER FOR FEW-SHOT HYPERSPECTRAL IMAGE CLASSIFICATION
    Guo, Ying
    He, Mingyi
    Fan, Bin
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 755 - 759
  • [4] Deep transformer and few-shot learning for hyperspectral image classification
    Ran, Qiong
    Zhou, Yonghao
    Hong, Danfeng
    Bi, Meiqiao
    Ni, Li
    Li, Xuan
    Ahmad, Muhammad
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023, 8 (04) : 1323 - 1336
  • [5] Few-Shot Image Classification Based on Swin Transformer + CSAM + EMD
    Sun, Huadong
    Zhang, Pengyi
    Zhang, Xu
    Han, Xiaowei
    ELECTRONICS, 2024, 13 (11)
  • [6] A Survey of Transformer-Based Few-Shot Image Classification Techniques
    Song, Chaoqi
    Liu, Ying
    He, Jinglu
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 599 - 608
  • [7] Only Image Cosine Embedding for Few-Shot Learning
    Gao, Songyi
    Shen, Weijie
    Liu, Zelin
    Zhu, An
    Yu, Yang
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT II, 2019, 11954 : 83 - 94
  • [8] Quantum Few-Shot Image Classification
    Huang, Zhihao
    Shi, Jinjing
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2025, 55 (01) : 194 - 206
  • [9] RGTransformer: Region-Graph Transformer for Image Representation and Few-Shot Classification
    Jiang, Bo
    Zhao, Kangkang
    Tang, Jin
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 792 - 796
  • [10] SPFormer: Self-Pooling Transformer for Few-Shot Hyperspectral Image Classification
    Li, Ziyu
    Xue, Zhaohui
    Xu, Qi
    Zhang, Ling
    Zhu, Tianzhi
    Zhang, Mengxue
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 19