I read, I saw, I tell: Texts Assisted Fine-Grained Visual Classification

Cited by: 23

Authors:

Li, Jingjing [1]

Zhu, Lei [2]

Huang, Zi [3]

Lu, Ke [1]

Zhao, Jidong [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China

[2] Shandong Normal Univ, Jinan, Peoples R China

[3] Univ Queensland, Sch ITEE, Brisbane, Qld, Australia

Source:

PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18) | 2018

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation

Keywords:

Fine-grained visual classification; multi-modal analysis; deep learning; transfer learning;

DOI:

10.1145/3240508.3240579

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

In visual classification tasks, it is hard to tell the subtle differences between one species and other similar breeds. This challenging problem is generally known as Fine-Grained Visual Classification (FGVC). In this paper, we propose a novel FGVC approach called Texts Assisted Fine-Grained Visual Classification (TA-FGVC). TA-FGVC reads from texts to gain attention, sees the images with the gained attention, and then tells the subtle differences. Technically, we propose a deep neural network that learns a visual-semantic embedding model. The proposed deep architecture mainly consists of two parts: visual localization and visual-to-semantic projection. The model is fed with both visual features, which are extracted from raw images, and semantic information, which is learned from two sources: gleaned from unannotated texts and gathered from image attributes. At the very last layer of the model, each image is embedded into the semantic space, which is related to class labels. Finally, the categorization results from both the visual stream and the visual-semantic stream are combined to reach the final decision. Extensive experiments on open standard benchmarks verify the superiority of our model over several state-of-the-art methods.
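
The two-stream decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear projection, the dot-product scoring against class embeddings, the weighted-average fusion rule, and every name below (`project`, `semantic_scores`, `fuse_and_classify`, `alpha`) are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(feature, weights):
    """Hypothetical linear visual-to-semantic projection: one weight row per semantic dimension."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def semantic_scores(embedded, class_embeddings):
    """Score each class by the dot product between the embedded image and the class embedding."""
    return [sum(a * b for a, b in zip(embedded, c)) for c in class_embeddings]


def fuse_and_classify(visual_logits, feature, weights, class_embeddings, alpha=0.5):
    """Combine the visual stream and the visual-semantic stream.

    alpha is an assumed mixing weight; the abstract does not state how the
    two streams are combined.
    """
    p_visual = softmax(visual_logits)
    embedded = project(feature, weights)
    p_semantic = softmax(semantic_scores(embedded, class_embeddings))
    fused = [alpha * pv + (1 - alpha) * ps for pv, ps in zip(p_visual, p_semantic)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Toy inputs (made up for illustration): two classes, a 3-d visual feature,
# and a 2-d semantic space.
label, fused = fuse_and_classify(
    visual_logits=[1.0, 1.1],
    feature=[1.0, 0.0, 2.0],
    weights=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    class_embeddings=[[0.0, 1.0], [1.0, 0.0]],
)
print(label, fused)
```

In this toy setup the visual stream alone slightly favors the second class, while the semantic stream favors the first, so the fused decision depends on `alpha`. In the paper, the projection and class embeddings are learned by the network; here they are fixed by hand.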

Pages: 663-671

Page count: 9

50 items in total

[21] DriverGuard: Virtualization-Based Fine-Grained Protection on I/O Flows
Cheng, Yueqiang
Ding, Xuhua
Deng, Robert H.
ACM TRANSACTIONS ON INFORMATION AND SYSTEM SECURITY, 2013, 16 (02)
[22] Directional response of a reconstituted fine-grained soil - Part I: Experimental investigation
Costanzo, Daniele
Viggiani, Gioacchino
Tamagnini, Claudio
INTERNATIONAL JOURNAL FOR NUMERICAL AND ANALYTICAL METHODS IN GEOMECHANICS, 2006, 30 (13) : 1283 - 1301
[23] Electromagnetic properties of the ground: Part I - Fine-grained soils at the Liquid Limit
Thomas, A. M.
Chapman, D. N.
Rogers, C. D. F.
Metje, N.
TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY, 2010, 25 (06) : 714 - 722
[24] Dual Transformer With Multi-Grained Assembly for Fine-Grained Visual Classification
Ji, Ruyi
Li, Jiaying
Zhang, Libo
Liu, Jing
Wu, Yanjun
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5009 - 5021
[25] A Progressive Gated Attention Model for Fine-Grained Visual Classification
Zhu, Qiangxi
Li, Zhixin
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2063 - 2068
[26] Learning Hierarchal Channel Attention for Fine-grained Visual Classification
Guan, Xiang
Wang, Guoqing
Xu, Xing
Bin, Yi
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5011 - 5019
[27] Hierarchical attention vision transformer for fine-grained visual classification
Hu, Xiaobin
Zhu, Shining
Peng, Taile
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 91
[28] Using Coarse Label Constraint for Fine-Grained Visual Classification
Lu, Chaohao
Zou, Yuexian
MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 266 - 277
[29] A collaborative gated attention network for fine-grained visual classification
Zhu, Qiangxi
Kuang, Wenlan
Li, Zhixin
DISPLAYS, 2023, 79
[30] Symmetrical irregular local features for fine-grained visual classification
Yang, Ming
Xu, Yang
Wu, Zebin
Wei, Zhihui
NEUROCOMPUTING, 2022, 505 : 304 - 314
