Improved Compressed Indexes for Full-Text Document Retrieval

Cited by: 0
Authors
Belazzougui, Djamal [1, 2]
Navarro, Gonzalo [2 ]
Affiliations
[1] Univ Paris 07, LIAFA, F-75221 Paris 05, France
[2] Univ Chile, Dept Comp Sci, Santiago, Chile
Keywords
EFFICIENT ALGORITHMS; QUERIES
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We give new space/time tradeoffs for compressed indexes that answer document retrieval queries on general sequences. On a collection of D documents of total length n, current approaches require at least |CSA| + O(n lg D / lg lg D) or 2|CSA| + o(n) bits of space, where CSA is a full-text index. Using monotone minimum perfect hash functions, we give new algorithms for document listing with frequencies and top-k document retrieval using just |CSA| + O(n lg lg lg D) bits. We also improve current solutions that use 2|CSA| + o(n) bits, and consider other problems such as colored range listing, top-k most important documents, and computing arbitrary frequencies.
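To make the two main queries concrete, the sketch below answers document listing with frequencies and top-k retrieval on a toy collection. It is not the method of the paper: it uses a plain, uncompressed suffix array and an explicit document array (about n lg D bits, which is exactly the space the compressed indexes aim to avoid). All function names are hypothetical, and the separator character is assumed not to occur in any document or pattern.

```python
from collections import Counter

SEP = "\x00"  # assumed separator, must not occur in documents or patterns


def build_index(docs):
    """Build a plain suffix array and a document array (brute-force baseline)."""
    text, owner = "", []
    for d, doc in enumerate(docs):
        text += doc + SEP
        owner += [d] * (len(doc) + 1)  # document id of every text position
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive construction
    da = [owner[i] for i in sa]  # document array, aligned with the suffix array
    return text, sa, da


def sa_range(text, sa, pattern):
    """Return the half-open suffix array interval [lo, hi) of suffixes starting with pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose length-m prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose length-m prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def list_with_frequencies(index, pattern):
    """Document listing with frequencies: {document id: number of occurrences}."""
    text, sa, da = index
    lo, hi = sa_range(text, sa, pattern)
    return dict(Counter(da[lo:hi]))


def top_k(index, pattern, k):
    """Top-k retrieval: the k documents where pattern occurs most often."""
    text, sa, da = index
    lo, hi = sa_range(text, sa, pattern)
    return Counter(da[lo:hi]).most_common(k)


index = build_index(["abracadabra", "banana", "cabana"])
print(list_with_frequencies(index, "ana"))
print(top_k(index, "ab", 2))
```

This baseline spends time proportional to the number of occurrences of the pattern, whereas the indexes described in the abstract aim to answer the same queries from compressed space.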
Pages: 386 / +
Page count: 3