Writer Retrieval using Compact Convolutional Transformers and NetMVLAD

Cited by: 2

Authors:

Peer, Marco [1]

Kleber, Florian [1]

Sablatnig, Robert [1]

Affiliations:

[1] TU Wien, Inst Visual Comp & Human Ctr Technol, Comp Vis Lab, Vienna, Austria

Source:

2022 26th International Conference on Pattern Recognition (ICPR) | 2022

Keywords:

IDENTIFICATION; FEATURES;

DOI:

10.1109/ICPR56361.2022.9956155

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a method for writer retrieval in which embeddings of patches extracted at SIFT keypoint locations are learned by a Compact Convolutional Transformer (CCT), a modified attention-based transformer architecture that includes convolutions, followed by a NetMVLAD layer and Generalized Max Pooling (GMP) to obtain global page descriptors. We introduce the application of CCTs to writer retrieval and show that they outperform the Convolutional Neural Network (CNN) used in current state-of-the-art methods for writer retrieval, namely ResNet18, while having only one-third as many parameters. Additionally, we propose NetMVLAD, an extension of NetVLAD with multiple vocabularies, which encodes information with different vocabulary sizes and improves on the original NetVLAD. The performance of CCTs is compared with that of ResNet18 on the ICDAR2013 Competition on Writer Identification dataset (ICDAR2013) and the CVL dataset, and the effect of multiple vocabularies applied within the NetVLAD layer is shown. CCT7 pretrained on CIFAR100 combined with NetMVLAD achieves 89.3% Mean Average Precision (mAP) on the ICDAR2013 dataset and 96.5% on the CVL dataset.
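
The multi-vocabulary idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the layer's details, so the code below assumes standard VLAD-style soft assignment (a softmax over negative scaled squared distances), uses fixed random cluster centers where a NetVLAD layer would learn its parameters, and omits the CCT backbone and Generalized Max Pooling entirely. All names (`soft_assign`, `vlad`, `multi_vocab_vlad`, `alpha`) and the vocabulary sizes are hypothetical.

```python
import math
import random


def soft_assign(x, centers, alpha):
    """Softmax over negative scaled squared distances from x to each center."""
    logits = [-alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(t * t for t in v))
    return [t / (norm + eps) for t in v]


def vlad(embeddings, centers, alpha=1.0):
    """Sum soft-weighted residuals per center, then normalize and flatten."""
    dim = len(centers[0])
    acc = [[0.0] * dim for _ in centers]
    for x in embeddings:
        weights = soft_assign(x, centers, alpha)
        for k, (w, c) in enumerate(zip(weights, centers)):
            for j in range(dim):
                acc[k][j] += w * (x[j] - c[j])
    flat = [t for row in acc for t in l2_normalize(row)]  # per-center normalization
    return l2_normalize(flat)


def multi_vocab_vlad(embeddings, vocabularies, alpha=1.0):
    """Assumed combination rule: concatenate one VLAD vector per vocabulary."""
    desc = []
    for centers in vocabularies:
        desc.extend(vlad(embeddings, centers, alpha))
    return l2_normalize(desc)


random.seed(0)
DIM = 8  # hypothetical embedding size
# Stand-ins for the patch embeddings of one page.
patches = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(50)]
# Two vocabularies of different sizes (2 and 4 centers), chosen arbitrarily.
vocabularies = [
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(k)] for k in (2, 4)
]
page_descriptor = multi_vocab_vlad(patches, vocabularies)
print(len(page_descriptor))  # (2 + 4) * DIM values
```

Concatenation is only one plausible way to combine vocabularies of different sizes. It makes the descriptor length grow with the total number of centers, which is why such descriptors are often reduced in dimension before retrieval.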

Pages: 1571-1578

Page count: 8
