Investigating Bloom Filters for Web Archives' Holdings

Cited by: 2
Authors
Klein, Martin [1]
Balakireva, Lyudmila [1]
Holub, Karolina [2]
Celjak, Drazenko [3]
Rudomino, Ingeborg [2]
Affiliations
[1] Los Alamos Natl Lab, Los Alamos, NM 87545 USA
[2] Natl & Univ Lib Zagreb, Zagreb, Croatia
[3] Univ Zagreb, Univ Comp Ctr, Zagreb, Croatia
Keywords
Bloom filters; web archives; web archive profiling; index sharing
DOI
10.1145/3529372.3530934
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
What web archives hold is often opaque to the public, and even experts in the domain struggle to provide precise assessments. Given the increasing need for and use of crawled and archived web resources, discovery of individual records as well as sharing of entire holdings are pressing use cases. We investigate Bloom Filters (BFs) and their applicability to address these use cases. We experiment with and analyze parameters for their creation, measure their performance, outline an approach for scalability, and describe various pilot implementations that showcase their potential to meet our needs. BFs come with beneficial characteristics and hence have enjoyed popularity in various domains. We highlight their suitability for web archiving use cases and how they can contribute to very fast and accurate search services.
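As an illustration of the creation parameters the abstract refers to, the following is a minimal sketch of a Bloom filter keyed by URL strings. It is not the authors' implementation: the class name `BloomFilter`, the helper `optimal_params`, the SHA-256 double-hashing scheme, and the `example.org` URLs are all assumptions made for this example. The sizing uses the standard textbook formulas m = -n ln(p) / (ln 2)^2 for the number of bits and k = (m/n) ln 2 for the number of hash functions.

```python
import hashlib
import math


def optimal_params(n_items: int, fp_rate: float) -> tuple:
    """Return (m bits, k hashes) from the standard Bloom filter formulas."""
    m = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    k = max(1, round((m / n_items) * math.log(2)))
    return m, k


class BloomFilter:
    """Hypothetical sketch; not the implementation described in the paper."""

    def __init__(self, n_items: int, fp_rate: float) -> None:
        self.m, self.k = optimal_params(n_items, fp_rate)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit halves
        # of a single SHA-256 digest (an assumed, illustrative scheme).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, never zero
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly held"; False means "definitely not held".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Usage: index 1,000 placeholder URLs at a 1% target false-positive rate.
holdings = BloomFilter(n_items=1000, fp_rate=0.01)
for i in range(1000):
    holdings.add(f"https://example.org/page/{i}")

print("https://example.org/page/42" in holdings)  # True (no false negatives)
```

A lookup for a URL that was never added usually returns False but may return True at roughly the configured false-positive rate, which is the trade-off that makes the filter compact enough to share.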
Pages: 10