Enhancing Few-Shot Image Classification With Cosine Transformer

Cited by: 7
Authors
Nguyen, Quang-Huy [1 ,2 ]
Nguyen, Cuong Q. [1 ,3 ]
Le, Dung D. D. [2 ]
Pham, Hieu H. [1 ,2 ,4 ]
Affiliations
[1] VinUniv, VinUni Illinois Smart Hlth Ctr, Hanoi 100000, Vietnam
[2] VinUniv, Coll Engn & Comp Sci, Hanoi 100000, Vietnam
[3] Vietnam Natl Univ Ho Chi Minh City, Univ Informat Technol, Comp Sci Dept, Ho Chi Minh City 700000, Vietnam
[4] Univ Illinois Urbana Champaign UIUC, Coordinated Sci Lab, Champaign, IL 61820 USA
Keywords
Few-shot learning; image classification; transformer; cross-attention; cosine similarity;
DOI
10.1109/ACCESS.2023.3298299
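Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract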
This paper addresses the few-shot image classification problem, where the classification task is performed on unlabeled query samples given only a small number of labeled support samples. One major challenge of the few-shot learning problem is the large variety of object visual appearances, which prevents the support samples from representing that object comprehensively. This might result in a significant difference between support and query samples, therefore undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing Few-shot Cosine Transformer (FS-CT), where the relational map between supports and queries is effectively obtained for the few-shot tasks. The FS-CT consists of two parts: a learnable prototypical embedding network to obtain categorical representations from support samples with hard cases, and a transformer encoder to effectively obtain the relational map between support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer module significantly and therefore improves FS-CT accuracy by 5% to over 20% compared to the default scaled dot-product mechanism. Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS on 1-shot and 5-shot learning tasks across backbones and few-shot configurations. We also developed a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical application. Our FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be applied in a wide range of domains, such as healthcare, medicine, and security surveillance. The official implementation code of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer.
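The contrast the abstract draws between Cosine Attention and the default scaled dot-product mechanism can be illustrated with a small sketch. The abstract does not give the formula, so the version below is an assumption: attention weights are taken to be the plain cosine similarities between query and key vectors, with no softmax. The function names (`cosine_attention`, `scaled_dot_product_attention`, `l2_normalize`) are hypothetical and are not taken from the official repository.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

def weighted_sum(weights, values):
    """Combine value vectors using one row of attention weights."""
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def cosine_attention(queries, keys, values):
    """Assumed form: weights[i][j] = cos(queries[i], keys[j]), bounded in [-1, 1]."""
    q_hat = [l2_normalize(q) for q in queries]
    k_hat = [l2_normalize(k) for k in keys]
    weights = [[sum(a * b for a, b in zip(q, k)) for k in k_hat] for q in q_hat]
    return [weighted_sum(row, values) for row in weights], weights

def scaled_dot_product_attention(queries, keys, values):
    """Standard baseline: softmax(q . k / sqrt(d)) applied row by row."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append(weighted_sum(row, values))
    return outputs, weights

# Toy example: one query feature attending over two class prototypes.
prototypes = [[2.0, 0.0], [0.0, 3.0]]
query = [[1.0, 0.0]]
_, cos_w = cosine_attention(query, prototypes, prototypes)
_, dot_w = scaled_dot_product_attention(query, prototypes, prototypes)
print(cos_w, dot_w)
```

Under this assumed form, the cosine weights depend only on the angle between feature vectors, so rescaling a query or prototype leaves them unchanged, whereas the scaled dot-product scores grow with feature magnitude before the softmax.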
Pages: 79659 - 79672
Page count: 14
Related papers
50 in total
  • [41] Few-Shot Image Classification via Mutual Distillation
    Zhang, Tianshu
    Dai, Wenwen
    Chen, Zhiyu
    Yang, Sai
    Liu, Fan
    Zheng, Hao
    APPLIED SCIENCES-BASEL, 2023, 13 (24)
  • [42] Disentangled Feature Representation for Few-Shot Image Classification
    Cheng, Hao
    Wang, Yufei
    Li, Haoliang
    Kot, Alex C.
    Wen, Bihan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (08) : 10422 - 10435
  • [43] Enhancement of Few-shot Image Classification Using Eigenimages
    Ko, Jonghyun
    Chung, Wonzoo
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, 21 (12) : 4088 - 4097
  • [44] Federated Learning and Optimization for Few-Shot Image Classification
    Zuo, Yi
    Chen, Zhenping
    Feng, Jing
    Fan, Yunhao
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (03) : 4649 - 4667
  • [45] Learning to Calibrate Prototypes for Few-Shot Image Classification
    Liang, Chenchen
    Jiang, Chenyi
    Wang, Shidong
    Zhang, Haofeng
    COGNITIVE COMPUTATION, 2025, 17 (01)
  • [46] Few-shot learning for skin lesion image classification
    Liu, Xue-Jun
    Li, Kai-li
    Luan, Hai-ying
    Wang, Wen-hui
    Chen, Zhao-yu
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (04) : 4979 - 4990
  • [47] SCFormer: Spectral Coordinate Transformer for Cross-Domain Few-Shot Hyperspectral Image Classification
    Li, Jiaojiao
    Zhang, Zhiyuan
    Song, Rui
    Li, Yunsong
    Du, Qian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 840 - 855
  • [48] ViTFSL-Baseline: A Simple Baseline of Vision Transformer Network for Few-Shot Image Classification
    Wang, Guangpeng
    Wang, Yongxiong
    Pan, Zhiqun
    Wang, Xiaoming
    Zhang, Jiapeng
    Pan, Jiayun
    IEEE ACCESS, 2024, 12 : 11836 - 11849
  • [49] Improved Few-Shot SAR Image Generation by Enhancing Diversity
    Bao, Jianghan
    Yu, Wen Ming
    Yang, Kaiqiao
    Liu, Che
    Cui, Tie Jun
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 3394 - 3408
  • [50] A review of few-shot classification
    Lim, Jia Min
    Lim, Kian Ming
    Lee, Chin Poo
    Lim, Jit Yan
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 275