Distribution-Aware Compressed Full-Text Indexes

Cited by: 6
Authors
Ferragina, Paolo [1]
Siren, Jouni [2]
Venturini, Rossano [1]
Affiliations
[1] Univ Pisa, Dipartimento Informat, I-56127 Pisa, Italy
[2] Univ Helsinki, Dept Comp Sci, SF-00510 Helsinki, Finland
Funding
Academy of Finland
Keywords
Full-text indexing; Compressed full-text indexes; Succinct data structures; Dynamic programming; K-LINK PATH; WEIGHT; GRAPHS
DOI
10.1007/s00453-013-9782-3
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.
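As an illustration of the reduction target named in the abstract, the following is a minimal sketch of the textbook dynamic program for a minimum weight K-link path, that is, a cheapest path using exactly K edges. It assumes a complete DAG on nodes 0..n-1 with an edge i -> j for every i < j. The function name `min_weight_k_link_path` and the `weight` callback are hypothetical; the paper's actual graph construction, edge weights, and algorithm are not reproduced here.

```python
from math import inf
from typing import Callable, List, Tuple


def min_weight_k_link_path(
    n: int, k: int, weight: Callable[[int, int], float]
) -> Tuple[float, List[int]]:
    """Return (cost, path) for a cheapest path from node 0 to node n - 1
    that uses exactly k edges, in the complete DAG with an edge i -> j
    for every i < j. `weight(i, j)` is a caller-supplied (hypothetical)
    edge cost; this is the plain O(k * n^2) dynamic program."""
    if n < 2 or not 1 <= k <= n - 1:
        raise ValueError("need n >= 2 and 1 <= k <= n - 1")

    # best[e][j]: cheapest cost of reaching node j from node 0 with exactly e edges.
    best = [[inf] * n for _ in range(k + 1)]
    # prev[e][j]: predecessor of j on that cheapest e-edge path.
    prev = [[-1] * n for _ in range(k + 1)]
    best[0][0] = 0.0

    for e in range(1, k + 1):
        for j in range(e, n):          # at least e edges are needed to reach node e
            for i in range(e - 1, j):  # last edge of the path is i -> j
                if best[e - 1][i] == inf:
                    continue
                cand = best[e - 1][i] + weight(i, j)
                if cand < best[e][j]:
                    best[e][j] = cand
                    prev[e][j] = i

    # Walk the predecessor table back from node n - 1 to node 0.
    path = [n - 1]
    for e in range(k, 0, -1):
        path.append(prev[e][path[-1]])
    path.reverse()
    return best[k][n - 1], path
```

For example, with the illustrative weight `(j - i) ** 2` on 7 nodes and K = 3, the cheapest path takes three equal hops, 0 -> 2 -> 4 -> 6. In an index-design setting, K would play the role of the budget (the number of links allowed), while the weights would encode the expected query cost; that mapping is an assumption made here for illustration only.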
Pages: 529-546
Page count: 18