SAUI: Scale-Aware Unseen Imagineer for Zero-Shot Object Detection

Citations: 0

Authors:

Wang, Jiahao [1]

Yan, Caixia [1]

Zhang, Weizhan [1]

Liu, Huan [1]

Sun, Hao [2]

Zheng, Qinghua [1]

Affiliations:

[1] Xi An Jiao Tong Univ, MOEKLINNS Lab, Sch Comp Sci & Technol, Xian, Peoples R China

[2] China Telecom Artificial Intelligence Technol Co, Hong Kong, Peoples R China

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6 | 2024

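Funding:

National Key Research and Development Program of China; China Postdoctoral Science Foundation; National Natural Science Foundation of China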

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Zero-shot object detection (ZSD) aims to localize and classify unseen objects without access to their training annotations. As a prevailing solution to ZSD, generation-based methods synthesize unseen visual features by taking seen features as reference and class semantic embeddings as guidance. Although previous works have continuously improved the synthesis quality, they fail to consider the scale-varying nature of unseen objects. The generation process is performed over a single scale of object features and thus lacks scale diversity among the synthesized features. In this paper, we reveal the scale-varying challenge in ZSD and propose a Scale-Aware Unseen Imagineer (SAUI) to lead the way toward a novel scale-aware ZSD paradigm. To obtain multi-scale features of seen-class objects, we design a specialized coarse-to-fine extractor to capture features through multiple scale-views. To generate unseen features scale by scale, we introduce a Series-GAN synthesizer along with three scale-aware contrastive components to imagine separable, diverse and robust scale-wise unseen features. Extensive experiments on the PASCAL VOC, COCO and DIOR datasets demonstrate SAUI's better performance in different scenarios, especially for scale-varying and small objects. Notably, SAUI achieves new state-of-the-art performance on COCO and DIOR.


Pages: 5445-5453

Page count: 9

