Linear-time construction of compressed suffix arrays using o(n log n)-bit working space for large alphabets

Cited by: 0
Authors
Na, JC [1]
Affiliation
[1] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The suffix array is a fundamental index data structure in string algorithms and bioinformatics, and the compressed suffix array (CSA) and the FM-index are its compressed versions. Many algorithms for constructing these index data structures have been developed. Recently, Hon et al. [11] proposed a construction algorithm using O(n log log |Σ|) time and O(n log |Σ|)-bit working space, which is the fastest algorithm using O(n log |Σ|)-bit working space. In this paper we give an efficient algorithm to construct the index data structures for large alphabets. Our algorithm constructs the suffix array, the CSA, and the FM-index using O(n) time and O(n log |Σ| log_{|Σ|}^α n)-bit working space, where α = log_3 2. Our algorithm takes less time and more space than Hon et al.'s algorithm, and it uses the least working space among alphabet-independent linear-time algorithms.
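To see how the three space bounds mentioned in the abstract relate, the following minimal Python sketch evaluates them for a fixed text length and a few alphabet sizes. It is only an illustration: all hidden constants are dropped, the function name `space_bounds_bits` and the dictionary keys are hypothetical, and the n log n figure stands for a plain (uncompressed) suffix array, which is an assumption added here for comparison.

```python
import math

# alpha = log_3 2 (about 0.63), as stated in the abstract.
ALPHA = math.log(2, 3)

def space_bounds_bits(n, sigma):
    """Evaluate the asymptotic bounds with every hidden constant dropped.

    Hypothetical helper for illustration only; it does not measure
    the real memory use of any of the algorithms.
    """
    log_sigma = math.log2(sigma)      # log |Sigma|
    log_sigma_n = math.log(n, sigma)  # log_{|Sigma|} n
    return {
        "hon_et_al": n * log_sigma,
        "this_paper": n * log_sigma * log_sigma_n ** ALPHA,
        "plain_suffix_array": n * math.log2(n),
    }

if __name__ == "__main__":
    n = 2 ** 30
    for sigma in (4, 256, 65536):
        bounds = space_bounds_bits(n, sigma)
        print(sigma, {name: f"{bits:.3g}" for name, bits in bounds.items()})
```

Because log_{|Σ|} n = log n / log |Σ|, the middle bound equals n (log |Σ|)^(1-α) (log n)^α, so for |Σ| < n it lies strictly between n log |Σ| and n log n, which matches the abstract's statement that the algorithm uses more space than Hon et al.'s while staying below n log n.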
Pages: 57-67
Page count: 11