Distribution-Aware Compressed Full-Text Indexes

Cited by: 6
Authors
Ferragina, Paolo [1]
Siren, Jouni [2]
Venturini, Rossano [1]
Affiliations
[1] Univ Pisa, Dipartimento Informat, I-56127 Pisa, Italy
[2] Univ Helsinki, Dept Comp Sci, SF-00510 Helsinki, Finland
Funding
Academy of Finland
Keywords
Full-text indexing; Compressed full-text indexes; Succinct data structures; Dynamic programming; K-LINK PATH; WEIGHT; GRAPHS;
DOI
10.1007/s00453-013-9782-3
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.
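To make the target of the reduction concrete, the following is a minimal sketch of a textbook dynamic program for the minimum weight K-link path problem. It is not the algorithm from the paper: the abstract does not describe how the Directed Acyclic Graph is built or how the path is computed, so the node numbering (0 to n, with an edge i -> j for every i < j), the function name `min_weight_k_link_path`, and the `weight` callback are all assumptions made for illustration. The sketch runs in O(K n^2) time.

```python
from math import inf
from typing import Callable, List, Tuple


def min_weight_k_link_path(
    n: int, k: int, weight: Callable[[int, int], float]
) -> Tuple[float, List[int]]:
    """Cheapest path from node 0 to node n that uses exactly k edges.

    Assumed (hypothetical) graph: nodes 0..n, one edge i -> j for every
    i < j, with cost weight(i, j). Returns (total cost, list of nodes).
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")

    # best[l][j]: minimum cost of reaching node j from node 0 with l edges.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # prev[l][j]: predecessor of j on that cheapest l-edge path.
    prev = [[-1] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for links in range(1, k + 1):
        for j in range(links, n + 1):
            for i in range(links - 1, j):
                if best[links - 1][i] == inf:
                    continue  # node i is not reachable with links - 1 edges
                cost = best[links - 1][i] + weight(i, j)
                if cost < best[links][j]:
                    best[links][j] = cost
                    prev[links][j] = i

    # Walk the predecessor table backwards from node n to recover the path.
    path = [n]
    node = n
    for links in range(k, 0, -1):
        node = prev[links][node]
        path.append(node)
    path.reverse()
    return best[k][n], path


if __name__ == "__main__":
    # Toy weight: squared edge length, so evenly spaced links are cheapest.
    cost, path = min_weight_k_link_path(6, 3, lambda i, j: (j - i) ** 2)
    print(cost, path)
```

In the toy call, K plays the role of a budget on the number of links, and the weight function stands in for whatever cost the designed graph assigns to each edge; both would have to be replaced by the paper's actual construction.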
Pages: 529-546
Number of pages: 18