Linear-time construction of compressed suffix arrays using o(n log n)-bit working space for large alphabets

Cited by: 0
Authors
Na, JC [1]
Affiliations
[1] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The suffix array is a fundamental index data structure in string algorithms and bioinformatics, and the compressed suffix array (CSA) and the FM-index are its compressed versions. Many algorithms for constructing these index data structures have been developed. Recently, Hon et al. [11] proposed a construction algorithm using $O(n \cdot \log\log|\Sigma|)$ time and $O(n \log|\Sigma|)$-bit working space, which is the fastest algorithm using $O(n \log|\Sigma|)$-bit working space. In this paper we give an efficient algorithm to construct the index data structures for large alphabets. Our algorithm constructs the suffix array, the CSA, and the FM-index using $O(n)$ time and $O(n \log|\Sigma| \log_{|\Sigma|}^{\alpha} n)$-bit working space, where $\alpha = \log_3 2$. Our algorithm takes less time and more space than Hon et al.'s algorithm. It uses the least working space among alphabet-independent linear-time algorithms.
Pages: 57-67
Number of pages: 11