Combining simulated data, foundation models, and few real samples for training fine-grained object detectors

Cited by: 0

Authors:

Heslinga, Friso G. [1]

Eker, Thijs A. [1]

Fokkinga, Ella P. [1]

van Woerden, Jan Erik [1]

Ruis, Frank A. [1]

den Hollander, Richard J. M. [1]

Schutte, Klamer [1]

Affiliations:

[1] TNO, Intelligent Imaging, Oude Waalsdorperweg 63, The Hague, Netherlands

Source:

SYNTHETIC DATA FOR ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING: TOOLS, TECHNIQUES, AND APPLICATIONS II | 2024 / Vol. 13035

Keywords:

Deep learning; Simulated data; Object detection; Foundation model; Military vehicles;

DOI:

10.1117/12.3013375

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic object detection is increasingly important in the military domain, with potential applications including target identification, threat assessment, and strategic decision-making. Deep learning has become the standard methodology for developing object detectors, but obtaining the necessary large set of training images can be challenging due to the restricted nature of military data. Moreover, for meaningful deployment, an object detection model needs to work in various environments and conditions in which prior data acquisition might not be possible. Simulated data can be an alternative to real images for model development, and recent work has shown the potential of training a military vehicle detector on simulated data. Nevertheless, fine-grained classification of detected military vehicles by training on simulated data remains an open challenge. In this study, we develop an object detector for 15 vehicle classes, including similar-looking types such as multiple battle tanks and howitzers. We show that combining a few real data samples with a large amount of simulated data (12,000 images) leads to a significant improvement over using either source individually. Adding just two samples per class improves the mAP to 55.9 [+/- 2.6], compared to 33.8 [+/- 0.7] when only simulated data is used. Further improvements are achieved by adding more real samples and using Grounding DINO, a foundation model pretrained on vast amounts of data (mAP = 90.1 [+/- 0.5]). In addition, we investigate the effect of simulation variation, which we find is important even when more real samples are available.
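
The abstract does not say how the few real samples are mixed with the 12,000 simulated images during training. One common option, shown here purely as an illustrative assumption and not as the authors' procedure, is to oversample the real samples so that they make up a fixed share of each training epoch. The function name build_mixed_epoch and the parameter real_fraction are hypothetical, and the tuples stand in for actual image/annotation records.

```python
import random

def build_mixed_epoch(simulated, real, real_fraction=0.1, seed=0):
    """Mix all simulated items with oversampled real items for one epoch.

    Assumption (not from the paper): real samples are drawn with
    replacement until they make up roughly `real_fraction` of the epoch.
    """
    if not 0.0 <= real_fraction < 1.0:
        raise ValueError("real_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_real / (n_real + n_sim) = real_fraction for n_real.
    n_real = 0
    if real:
        n_real = round(real_fraction * len(simulated) / (1.0 - real_fraction))
    real_draws = [rng.choice(real) for _ in range(n_real)]
    epoch = list(simulated) + real_draws
    rng.shuffle(epoch)
    return epoch

# Placeholder records: 12,000 simulated images, 2 real samples for each of 15 classes.
simulated = [("sim", i) for i in range(12000)]
real = [("real", cls, k) for cls in range(15) for k in range(2)]

epoch = build_mixed_epoch(simulated, real, real_fraction=0.1)
n_real_in_epoch = sum(1 for item in epoch if item[0] == "real")
print(len(epoch), n_real_in_epoch)
```

With only 30 real samples, each one is repeated many times per epoch under this scheme, so in practice it would usually be paired with data augmentation; the value of real_fraction is a tuning choice, not a figure from the paper.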

Pages: 12
