Enhancing Few-Shot Image Classification With Cosine Transformer

Cited by: 7
Authors
Nguyen, Quang-Huy [1 ,2 ]
Nguyen, Cuong Q. [1 ,3 ]
Le, Dung D. D. [2 ]
Pham, Hieu H. [1 ,2 ,4 ]
Affiliations
[1] VinUniv, VinUni Illinois Smart Hlth Ctr, Hanoi 100000, Vietnam
[2] VinUniv, Coll Engn & Comp Sci, Hanoi 100000, Vietnam
[3] Vietnam Natl Univ Ho Chi Minh City, Univ Informat Technol, Comp Sci Dept, Ho Chi Minh City 700000, Vietnam
[4] Univ Illinois Urbana Champaign UIUC, Coordinated Sci Lab, Champaign, IL 61820 USA
Keywords
Few-shot learning; image classification; transformer; cross-attention; cosine similarity;
DOI
10.1109/ACCESS.2023.3298299
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This paper addresses the few-shot image classification problem, where classification is performed on unlabeled query samples given only a small number of labeled support samples. A major challenge in few-shot learning is the large variety of object visual appearances, which prevents the support samples from representing an object comprehensively. This can lead to a significant difference between support and query samples, thereby undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing the Few-shot Cosine Transformer (FS-CT), in which the relational map between supports and queries is obtained effectively for few-shot tasks. FS-CT consists of two parts: a learnable prototypical embedding network that obtains categorical representations from support samples, including hard cases, and a transformer encoder that effectively derives the relational map between the support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer significantly and thereby improves FS-CT accuracy by 5% to over 20% compared with the default scaled dot-product mechanism. Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS for 1-shot and 5-shot learning tasks across backbones and few-shot configurations. We also develop a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical applications. FS-CT with Cosine Attention is a lightweight, simple few-shot algorithm that can be applied to a wide range of applications, such as healthcare, medical imaging, and security surveillance. The official implementation of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer.
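Based only on the abstract's description, the following is a minimal sketch of how a cosine-similarity attention head between query and support embeddings might look. The class name, single-head design, tensor shapes, and the omission of any softmax or extra normalization over the cosine scores are assumptions for illustration, not details taken from the paper; the official implementation at the GitHub link above is authoritative.

import torch
import torch.nn as nn
import torch.nn.functional as F


class CosineAttention(nn.Module):
    """Attention whose weights are cosine similarities rather than scaled dot products (sketch)."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, queries: torch.Tensor, supports: torch.Tensor) -> torch.Tensor:
        # queries:  (batch, n_query,   embed_dim) query-sample embeddings
        # supports: (batch, n_support, embed_dim) prototypical support embeddings
        q = F.normalize(self.q_proj(queries), dim=-1)   # unit-length rows
        k = F.normalize(self.k_proj(supports), dim=-1)
        v = self.v_proj(supports)
        attn = q @ k.transpose(-2, -1)                  # cosine similarities in [-1, 1]
        return attn @ v                                 # relational map applied to the values


# Toy usage: a 5-way episode with 15 query images and 64-d embeddings (illustrative sizes).
attn = CosineAttention(embed_dim=64)
out = attn(torch.randn(1, 15, 64), torch.randn(1, 5, 64))
print(out.shape)  # torch.Size([1, 15, 64])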
Pages: 79659-79672
Page count: 14
Related Papers
50 items in total
  • [1] Adaptive feature recalibration transformer for enhancing few-shot image classification
    Song, Wei
    Huang, Yaobin
    VISUAL COMPUTER, 2025
  • [2] Enhancing Few-Shot Image Classification with Unlabelled Examples
    Bateni, Peyman
    Barber, Jarred
    van de Meent, Jan-Willem
    Wood, Frank
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1597 - 1606
  • [3] GRID-TRANSFORMER FOR FEW-SHOT HYPERSPECTRAL IMAGE CLASSIFICATION
    Guo, Ying
    He, Mingyi
    Fan, Bin
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 755 - 759
  • [4] Deep transformer and few-shot learning for hyperspectral image classification
    Ran, Qiong
    Zhou, Yonghao
    Hong, Danfeng
    Bi, Meiqiao
    Ni, Li
    Li, Xuan
    Ahmad, Muhammad
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023, 8 (04) : 1323 - 1336
  • [5] Few-Shot Image Classification Based on Swin Transformer + CSAM + EMD
    Sun, Huadong
    Zhang, Pengyi
    Zhang, Xu
    Han, Xiaowei
    ELECTRONICS, 2024, 13 (11)
  • [6] A Survey of Transformer-Based Few-Shot Image Classification Techniques
    Song, Chaoqi
    Liu, Ying
    He, Jinglu
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 599 - 608
  • [7] Only Image Cosine Embedding for Few-Shot Learning
    Gao, Songyi
    Shen, Weijie
    Liu, Zelin
    Zhu, An
    Yu, Yang
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT II, 2019, 11954 : 83 - 94
  • [8] Quantum Few-Shot Image Classification
    Huang, Zhihao
    Shi, Jinjing
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2025, 55 (01) : 194 - 206
  • [9] RGTransformer: Region-Graph Transformer for Image Representation and Few-Shot Classification
    Jiang, Bo
    Zhao, Kangkang
    Tang, Jin
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 792 - 796
  • [10] SPFormer: Self-Pooling Transformer for Few-Shot Hyperspectral Image Classification
    Li, Ziyu
    Xue, Zhaohui
    Xu, Qi
    Zhang, Ling
    Zhu, Tianzhi
    Zhang, Mengxue
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 19