MT-ASM: a multi-task attention strengthening model for fine-grained object recognition

Cited by: 0

Authors:

Liu, Dichao [1,2,5]

Wang, Yu [3]

Mase, Kenji [2]

Kato, Jien [4]

Affiliations:

[1] Navier Inc, Chiyoda Ku, Tokyo 1020084, Japan

[2] Nagoya Univ, Grad Sch Informat, Chikusa Ku, Nagoya, Aichi 4648601, Japan

[3] Hitotsubashi Univ, Ctr Informat & Commun Technol, Informat Syst Management Headquarters, 2 chome 1 Naka, Kunitachi, Tokyo 1868601, Japan

[4] Kochi Univ Technol, Sch Data & Innovat, 185 Miyanokuchi, Kami, Kochi 7828502, Japan

[5] Tokyo Inst Technol, Inst Innovat Res, Nagatsuta Cho 4259,Midori Ku, Yokohama, Kanagawa 2268503, Japan

Source:

MULTIMEDIA SYSTEMS | 2024 / Vol. 30 / Issue 05

Keywords:

Fine-grained recognition; Multi-task learning; Contrastive learning; VIDEO;

DOI:

10.1007/s00530-024-01446-1

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Fine-Grained Object Recognition (FGOR) equips intelligent systems with recognition capabilities at or even beyond the level of human experts, making it a core technology for numerous applications such as biodiversity monitoring systems and advanced driver assistance systems. FGOR is highly challenging, and recent research has primarily focused on identifying discriminative regions to tackle this task. However, these methods often require extensive manual labor or expensive algorithms, which may lead to irreversible information loss and pose significant barriers to their practical application. Instead of learning region capturing, this work enhances networks' response to discriminative regions. We propose a multitask attention-strengthening model (MT-ASM), inspired by the human ability to effectively utilize experiences from related tasks when solving a specific task. When faced with an FGOR task, humans naturally compare images from the same and different categories to identify discriminative and non-discriminative regions. MT-ASM employs two networks during the training phase: the major network, tasked with the main goal of category classification, and a subordinate task that involves comparing images from the same and different categories to find discriminative and non-discriminative regions. The subordinate network evaluates the major network's performance on the subordinate task, compelling the major network to improve its subordinate task performance. Once training is complete, the subordinate network is removed, ensuring no additional overhead during inference. Experimental results on CUB-200-2011, Stanford Cars, and FGVC-Aircraft datasets demonstrate that MT-ASM significantly outperforms baseline methods. Given its simplicity and low overhead, it remains highly competitive with state-of-the-art methods. The code is available at https://github.com/Dichao-Liu/Find-Attention-with-Comparison.
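
The abstract does not state the loss functions, so the following is only a minimal sketch of the general idea it describes: a main classification objective is combined with an auxiliary comparison objective during training, and only the classifier output is used at inference. A standard margin-based contrastive loss stands in for the subordinate comparison task here, and the subordinate network itself is not modelled. All function names and the `aux_weight` and `margin` parameters are hypothetical and are not taken from the paper or its repository.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (the main classification task)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def pairwise_contrastive(feat_a, feat_b, same_category, margin=1.0):
    """Margin-based contrastive loss on one pair of feature vectors.

    Assumption: this stands in for the comparison of images from the
    same and different categories; it is not the authors' formulation.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_category:
        return dist ** 2  # pull same-category features together
    return max(0.0, margin - dist) ** 2  # push different categories apart


def training_loss(logits, label, feat_a, feat_b, same_category, aux_weight=0.5):
    """Training objective: main loss plus a weighted auxiliary term."""
    aux = pairwise_contrastive(feat_a, feat_b, same_category)
    return cross_entropy(logits, label) + aux_weight * aux


def predict(logits):
    """Inference uses the classifier output only; no auxiliary term is evaluated."""
    return max(range(len(logits)), key=lambda i: logits[i])
```

Because the auxiliary term appears only in `training_loss`, dropping it after training leaves `predict` unchanged, which mirrors the abstract's claim of no additional overhead during inference.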


Pages: 16

