MT-ASM: a multi-task attention strengthening model for fine-grained object recognition

Cited by: 0
Authors
Liu, Dichao [1, 2, 5]
Wang, Yu [3]
Mase, Kenji [2]
Kato, Jien [4]
Affiliations
[1] Navier Inc, Chiyoda Ku, Tokyo 1020084, Japan
[2] Nagoya Univ, Grad Sch Informat, Chikusa Ku, Nagoya, Aichi 4648601, Japan
[3] Hitotsubashi Univ, Ctr Informat & Commun Technol, Informat Syst Management Headquarters, 2 chome 1 Naka, Kunitachi, Tokyo 1868601, Japan
[4] Kochi Univ Technol, Sch Data & Innovat, 185 Miyanokuchi, Kami, Kochi 7828502, Japan
[5] Tokyo Inst Technol, Inst Innovat Res, Nagatsuta Cho 4259,Midori Ku, Yokohama, Kanagawa 2268503, Japan
Keywords
Fine-grained recognition; Multi-task learning; Contrastive learning; VIDEO
DOI
10.1007/s00530-024-01446-1
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Fine-Grained Object Recognition (FGOR) equips intelligent systems with recognition capabilities at or even beyond the level of human experts, making it a core technology for applications such as biodiversity monitoring systems and advanced driver assistance systems. FGOR is highly challenging, and recent research has primarily focused on identifying discriminative regions to tackle it. However, these methods often require extensive manual labor or expensive algorithms, which may lead to irreversible information loss and pose significant barriers to practical application. Instead of learning to capture regions, this work strengthens a network's response to discriminative regions. We propose a multi-task attention-strengthening model (MT-ASM), inspired by the human ability to draw on experience from related tasks when solving a specific task. When faced with an FGOR task, humans naturally compare images from the same and different categories to identify discriminative and non-discriminative regions. MT-ASM employs two networks during training: a major network, tasked with the main goal of category classification, and a subordinate network, associated with a subordinate task of comparing images from the same and different categories to find discriminative and non-discriminative regions. The subordinate network evaluates the major network's performance on the subordinate task, compelling the major network to improve on that task. Once training is complete, the subordinate network is removed, so inference incurs no additional overhead. Experimental results on the CUB-200-2011, Stanford Cars, and FGVC-Aircraft datasets demonstrate that MT-ASM significantly outperforms baseline methods. Given its simplicity and low overhead, it remains highly competitive with state-of-the-art methods. The code is available at https://github.com/Dichao-Liu/Find-Attention-with-Comparison.
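The training-only auxiliary objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above), and the abstract does not specify the loss functions. The sketch assumes a softmax cross-entropy for the main task and a generic margin-based contrastive loss for the comparison task; the function names, the `margin` value, and the `weight` factor are all hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (stands in for the main task)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def pair_contrastive_loss(feat_a, feat_b, same_category, margin=1.0):
    """Generic margin-based contrastive loss on two feature vectors.

    Assumption: a stand-in for the comparison (subordinate) task, not the
    loss used in the paper.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_category:
        return dist ** 2  # pull same-category features together
    return max(0.0, margin - dist) ** 2  # push different categories apart


def training_loss(logits, label, feat_a, feat_b, same_category, weight=0.5):
    """Training objective: main loss plus a weighted auxiliary term."""
    main = cross_entropy(logits, label)
    aux = pair_contrastive_loss(feat_a, feat_b, same_category)
    return main + weight * aux


def predict(logits):
    """Inference uses only the classifier output; no auxiliary term."""
    return max(range(len(logits)), key=lambda i: logits[i])
```

In this sketch the auxiliary term appears only in `training_loss`, while `predict` depends on the classifier output alone, which mirrors the abstract's point that removing the subordinate network adds no inference overhead.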
Pages: 16