SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting

Authors
Zgaren, Ahmed [1, 2]
Bouachir, Wassim [2 ]
Bouguila, Nizar [1 ]
Affiliations
[1] Concordia Univ, Concordia Inst Informat & Syst Engn CIISE, Montreal, PQ H3G 1M8, Canada
[2] Univ Quebec TELUQ, Data Sci Lab, Montreal, PQ H2S 3L5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
object counting; transformers; visual attention; zero-shot; class-agnostic
DOI
10.3390/jimaging11020052
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Zero-shot counting is a subcategory of Generic Visual Object Counting, which aims to count objects from an arbitrary class in a given image. While few-shot counting relies on delivering exemplars to the model to count similar class objects, zero-shot counting automates the operation for faster processing. This paper proposes a fully automated zero-shot method outperforming both zero-shot and few-shot methods. By exploiting feature maps from a pre-trained detection-based backbone, we introduce a new Visual Embedding Module designed to generate semantic embeddings within object contextual information. These embeddings are then fed to a Self-Attention Matching Module to generate an encoded representation for the head counter. Our proposed method has outperformed recent zero-shot approaches, achieving the best Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) results of 8.89 and 35.83, respectively, on the FSC147 dataset. Additionally, our method demonstrates competitive performance compared to few-shot methods, advancing the capabilities of visual object counting in various industrial applications such as tree counting, wildlife animal counting, and medical applications like blood cell counting.
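The two metrics quoted in the abstract, MAE and RMSE, compare predicted object counts with ground-truth counts across a set of test images. The following is a minimal sketch of their standard definitions, not the authors' evaluation code; the function names and the sample counts are hypothetical and chosen only for illustration.

```python
import math


def mean_absolute_error(predicted, ground_truth):
    """Average absolute difference between predicted and true counts."""
    errors = [abs(p - t) for p, t in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)


def root_mean_square_error(predicted, ground_truth):
    """Square root of the average squared count difference."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-image counts (illustrative values, not from FSC147).
predicted_counts = [10, 52, 7, 120]
true_counts = [12, 50, 7, 100]

mae = mean_absolute_error(predicted_counts, true_counts)
rmse = root_mean_square_error(predicted_counts, true_counts)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Because RMSE squares each error before averaging, a single image with a large miscount (the last one in this sample) raises RMSE far more than MAE, which is one reason a reported RMSE can be much larger than the MAE on the same dataset.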
Pages: 21