Optimal Substring Equality Queries with Applications to Sparse Text Indexing

Cited by: 3
Authors
Prezza, Nicola [1, 2]
Affiliations
[1] LUISS Guido Carli, Viale Romania 32, IT-00197 Rome, Italy
[2] Ca' Foscari University of Venice, Venice, Italy
Keywords
Substring equality queries; in-place; suffix sorting; lower bounds; suffix; construction
DOI
10.1145/3426870
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We consider the problem of encoding a string of length n from an integer alphabet of size sigma so that access, substring equality, and Longest Common Extension (LCE) queries can be answered efficiently. We describe a new space-optimal data structure supporting logarithmic-time queries. Access and substring equality query times can furthermore be improved to the optimal O(1) if O(log n) additional precomputed words are allowed in the total space. Additionally, we provide in-place algorithms for converting between the string and our data structure. Using this new string representation, we obtain the first in-place subquadratic algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to build a correct version of our data structure using small working space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in O(n log n) time w.h.p. and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in O(n^{1.5} sqrt(log sigma)) time w.h.p.
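To make the three query types named in the abstract concrete, the following is a minimal sketch of Monte Carlo substring equality and LCE queries using textbook polynomial prefix fingerprints. It is not the space-optimal structure the abstract describes: it stores O(n) extra words next to the text, and the class name `PrefixFingerprints`, the modulus, and the random base are assumptions made for illustration only. Equal substrings always compare equal; unequal substrings may collide with small probability.

```python
import random

# Assumed modulus: the Mersenne prime 2^61 - 1 (an illustrative choice).
MOD = (1 << 61) - 1


class PrefixFingerprints:
    """Illustrative O(n)-word fingerprint table (hypothetical helper)."""

    def __init__(self, text, seed=None):
        rng = random.Random(seed)
        self.n = len(text)
        # Random base keeps the collision probability low for any fixed input.
        self.base = rng.randrange(256, MOD - 1)
        self.pref = [0] * (self.n + 1)  # pref[i] = fingerprint of text[:i]
        self.pw = [1] * (self.n + 1)    # pw[i] = base^i mod MOD
        for i, ch in enumerate(text):
            self.pref[i + 1] = (self.pref[i] * self.base + ord(ch) + 1) % MOD
            self.pw[i + 1] = (self.pw[i] * self.base) % MOD

    def fingerprint(self, i, length):
        """Fingerprint of text[i : i + length], computed in O(1)."""
        return (self.pref[i + length] - self.pref[i] * self.pw[length]) % MOD

    def equal(self, i, j, length):
        """Substring equality query; may report a false positive (Monte Carlo)."""
        return self.fingerprint(i, length) == self.fingerprint(j, length)

    def lce(self, i, j):
        """Longest common extension of suffixes i and j via binary search."""
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.equal(i, j, mid):
                lo = mid        # first `mid` characters match (w.h.p.)
            else:
                hi = mid - 1    # a mismatch occurs within the first `mid`
        return lo
```

With this sketch, an equality query costs O(1) time and an LCE query costs O(log n) time, matching the query times quoted in the abstract but not its space bound.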
Pages: 23