A Generic Image Retrieval Method for Date Estimation of Historical Document Collections

Cited by: 1
Authors
Molina, Adria [1]
Gomez, Lluis
Ramos Terrades, Oriol
Llados, Josep
Affiliation
[1] Univ Autonoma Barcelona, Comp Vis Ctr, Bellaterra, Catalunya, Spain
Source
DOCUMENT ANALYSIS SYSTEMS, DAS 2022 | 2022 / Vol. 13237
Keywords
Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG
DOI
10.1007/978-3-031-06555-2_39
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Date estimation of historical document images is a challenging problem, and several contributions in the literature lack the ability to generalize from one dataset to another. This paper presents a robust date estimation system based on a retrieval approach that generalizes well across heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordering of documents for each problem. One of the main uses of the presented approach is as a tool for historical contextual retrieval: scholars can perform comparative analysis of historical images from large datasets in terms of the period in which they were produced. We provide an experimental evaluation on different types of documents from real datasets of manuscript and newspaper images.
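The ranking loss named in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: it assumes the common smoothing in which each item's rank is approximated by a sum of sigmoids over pairwise score differences, and it uses a hypothetical graded relevance that decreases with the gap in years between query and document (`date_relevance`, `max_gap`, and the temperature `tau` are illustrative names and values). It is written in plain Python for readability; in practice the same expression would be computed on tensors so that gradients can reach the network.

```python
import math


def sigmoid(x, tau=0.01):
    """Numerically stable sigmoid of x / tau (tau is an assumed temperature)."""
    z = x / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_ranks(scores, tau=0.01):
    """Soft rank of each item: 1 + sum over j != i of sigmoid((s_j - s_i) / tau)."""
    n = len(scores)
    return [
        1.0 + sum(sigmoid(scores[j] - scores[i], tau) for j in range(n) if j != i)
        for i in range(n)
    ]


def date_relevance(query_year, years, max_gap=10):
    """Hypothetical graded relevance: documents closer in time are more relevant."""
    return [max(0, max_gap - abs(query_year - y)) for y in years]


def smooth_ndcg(scores, relevance, tau=0.01):
    """nDCG computed with soft ranks in the discount term."""
    ranks = smooth_ranks(scores, tau)
    dcg = sum(r / math.log2(1.0 + k) for r, k in zip(relevance, ranks))
    # Ideal DCG uses exact ranks 1..n of the relevance values sorted descending.
    ideal = sorted(relevance, reverse=True)
    idcg = sum(r / math.log2(2.0 + k) for k, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def smooth_ndcg_loss(scores, relevance, tau=0.01):
    """Loss to minimize: 1 - smooth nDCG."""
    return 1.0 - smooth_ndcg(scores, relevance, tau)


# Toy usage: a query dated 1900 and three retrieved documents.
years = [1900, 1905, 1920]
rel = date_relevance(1900, years)
good = smooth_ndcg([0.9, 0.5, 0.1], rel)  # similarity order matches date order
bad = smooth_ndcg([0.1, 0.5, 0.9], rel)   # similarity order is reversed
```

In this toy case the well-ordered similarity scores yield a smooth nDCG close to 1 and the reversed ones a lower value, which is the signal a retrieval-based date estimator would be trained to maximize.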
Pages: 583-597
Page count: 15