We consider the problem of encoding a string of length n from an integer alphabet of size sigma so that access, substring equality, and Longest Common Extension (LCE) queries can be answered efficiently. We describe a new space-optimal data structure supporting logarithmic-time queries. Access and substring equality query times can furthermore be improved to the optimal O(1) if O(log n) additional precomputed words are allowed in the total space. Additionally, we provide in-place algorithms for converting between the string and our data structure. Using this new string representation, we obtain the first in-place subquadratic algorithms for several string-processing problems in the restore model, in which the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to build a correct version of our data structure using small working space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in O(n log n) time w.h.p. and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in O(n^1.5 sqrt(log sigma)) time w.h.p.
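To make the three query types concrete, the following is a minimal Python sketch of Monte Carlo substring equality and LCE queries built on Karp-Rabin prefix fingerprints. It is an illustration under stated assumptions, not the space-optimal structure described above: it stores O(n) extra words next to the text rather than replacing the text in place, and the class name FingerprintLCE, its method names, and the choice of modulus 2^61 - 1 are all hypothetical. Equality queries take O(1) time and LCE queries take O(log n) time by binary search; answers are correct with high probability over the random base.

```python
import random

# Assumed modulus for the fingerprints (a Mersenne prime); not taken from the source.
MODULUS = (1 << 61) - 1


class FingerprintLCE:
    """Illustrative Monte Carlo index: O(n) extra words, NOT in-place."""

    def __init__(self, text, seed=None):
        rng = random.Random(seed)
        self.text = text
        self.n = len(text)
        self.base = rng.randrange(1, MODULUS)  # random evaluation point
        # prefix[i] = fingerprint of text[0:i]; power[i] = base**i mod MODULUS
        self.prefix = [0] * (self.n + 1)
        self.power = [1] * (self.n + 1)
        for i, ch in enumerate(text):
            self.prefix[i + 1] = (self.prefix[i] * self.base + ord(ch)) % MODULUS
            self.power[i + 1] = (self.power[i] * self.base) % MODULUS

    def access(self, i):
        """Return the character at position i."""
        return self.text[i]

    def fingerprint(self, i, length):
        """Fingerprint of text[i:i+length], derived from two prefix fingerprints."""
        return (self.prefix[i + length] - self.prefix[i] * self.power[length]) % MODULUS

    def equal(self, i, j, length):
        """Substring equality; may report a false positive with tiny probability."""
        return self.fingerprint(i, length) == self.fingerprint(j, length)

    def lce(self, i, j):
        """Length of the longest common prefix of the suffixes starting at i and j."""
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.equal(i, j, mid):
                lo = mid        # the first mid characters match (w.h.p.)
            else:
                hi = mid - 1    # a mismatch occurs within the first mid characters
        return lo
```

For example, on the text "abracadabra" the suffixes starting at positions 0 and 7 share the prefix "abra", so lce(0, 7) is expected to return 4. Sorting a small set of suffixes with a comparator that calls lce and then compares the first mismatching characters is one way such queries could drive sparse suffix sorting.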