Distribution-Aware Compressed Full-Text Indexes

Cited by: 6
Authors
Ferragina, Paolo [1]
Sirén, Jouni [2]
Venturini, Rossano [1]
Affiliations
[1] Univ Pisa, Dipartimento Informat, I-56127 Pisa, Italy
[2] Univ Helsinki, Dept Comp Sci, SF-00510 Helsinki, Finland
Funding
Academy of Finland
Keywords
Full-text indexing; Compressed full-text indexes; Succinct data structures; Dynamic programming; K-link path; Weight; Graphs
DOI
10.1007/s00453-013-9782-3
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.
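The abstract names the minimum weight K-link path problem without describing the graph it is applied to. As a rough illustration of that problem only, the following sketch solves it with a textbook dynamic program over a complete DAG on nodes 0..n-1. The function name, the graph shape, and the quadratic weight function in the usage lines are assumptions made for this example; they are not the construction or the algorithm from the paper.

```python
from math import inf
from typing import Callable, List, Tuple


def min_weight_k_link_path(
    n: int, k: int, weight: Callable[[int, int], float]
) -> Tuple[float, List[int]]:
    """Cheapest path from node 0 to node n-1 that uses exactly k edges.

    Assumed graph (hypothetical, for illustration): nodes 0..n-1 with an
    edge (i, j) for every i < j, whose cost is weight(i, j).
    Runs in O(k * n^2) time.
    """
    if n < 2 or not 1 <= k <= n - 1:
        raise ValueError("need n >= 2 and 1 <= k <= n - 1")

    # best[links][j]: minimum cost of reaching node j with exactly `links` edges.
    best = [[inf] * n for _ in range(k + 1)]
    # prev[links][j]: predecessor of j on that cheapest path.
    prev = [[-1] * n for _ in range(k + 1)]
    best[0][0] = 0.0

    for links in range(1, k + 1):
        for j in range(1, n):
            for i in range(j):
                if best[links - 1][i] == inf:
                    continue  # node i is not reachable with links-1 edges
                candidate = best[links - 1][i] + weight(i, j)
                if candidate < best[links][j]:
                    best[links][j] = candidate
                    prev[links][j] = i

    # Walk the predecessor table backwards to recover the path.
    path = [n - 1]
    node = n - 1
    for links in range(k, 0, -1):
        node = prev[links][node]
        path.append(node)
    path.reverse()
    return best[k][n - 1], path


# Hypothetical weight: the squared length of the jump between two nodes.
cost, path = min_weight_k_link_path(7, 3, lambda i, j: (j - i) ** 2)
print(cost, path)
```

With a convex weight such as the squared jump length, the cheapest path spaces its K links evenly. In a setting like the one the abstract describes, the weights would presumably encode expected query cost and K would follow from the space bound, but that mapping is not given in this record.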
Pages: 529-546
Page count: 18
Related papers
50 in total
  • [1] Distribution-Aware Compressed Full-Text Indexes
    Paolo Ferragina
    Jouni Sirén
    Rossano Venturini
    Algorithmica, 2013, 67 : 529 - 546
  • [2] Distribution-Aware Compressed Full-Text Indexes
    Ferragina, Paolo
    Sirén, Jouni
    Venturini, Rossano
    ALGORITHMS - ESA 2011, 2011, 6942 : 760 - 771
  • [3] Compressed full-text indexes
    Navarro, Gonzalo
    Mäkinen, Veli
    ACM COMPUTING SURVEYS, 2007, 39 (01)
  • [4] Compressed Representations of Sequences and Full-Text Indexes
    Ferragina, Paolo
    Manzini, Giovanni
    Mäkinen, Veli
    Navarro, Gonzalo
    ACM TRANSACTIONS ON ALGORITHMS, 2007, 3 (02)
  • [5] Improved compressed indexes for full-text document retrieval
    Belazzougui, Djamal
    Navarro, Gonzalo
    Valenzuela, Daniel
    JOURNAL OF DISCRETE ALGORITHMS, 2013, 18 : 3 - 13
  • [6] Improved Compressed Indexes for Full-Text Document Retrieval
    Belazzougui, Djamal
    Navarro, Gonzalo
    STRING PROCESSING AND INFORMATION RETRIEVAL, 2011, 7024 : 386 - +
  • [7] Dynamic entropy-compressed sequences and full-text indexes
    Mäkinen, Veli
    Navarro, Gonzalo
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2006, 4009 : 306 - 317
  • [8] Dynamic Entropy-Compressed Sequences and Full-Text Indexes
    Mäkinen, Veli
    Navarro, Gonzalo
    ACM TRANSACTIONS ON ALGORITHMS, 2008, 4 (03)
  • [9] Computing Matching Statistics and Maximal Exact Matches on Compressed Full-Text Indexes
    Ohlebusch, Enno
    Gog, Simon
    Kügel, Adrian
    STRING PROCESSING AND INFORMATION RETRIEVAL, 2010, 6393 : 347 - 358
  • [10] Full-text indexes in external memory
    Kärkkäinen, J
    Rao, SS
    ALGORITHMS FOR MEMORY HIERARCHIES: ADVANCED LECTURES, 2003, 2625 : 149 - 170