An Efficient Algorithm for Suffix Sorting

Cited by: 0
Authors
Peng, Zhan [1]
Wang, Yuping [1]
Xue, Xingsi [2]
Wei, Jingxuan [1]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Shaanxi, Peoples R China
[2] Fujian Univ Technol, Sch Informat Sci & Engn, Fuzhou 350118, Fujian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Suffix sorting; suffix array; text index; computational biology; CONSTRUCTION; ARRAYS
DOI
10.1142/S0218001416590187
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Suffix Array (SA) is a fundamental data structure that is widely used in applications such as string matching, text indexing, and computational biology. Sorting the suffixes of a string in lexicographical order is a primary problem in constructing SAs, and one of the most widely used suffix sorting algorithms is qsufsort. However, qsufsort suffers from one critical limitation: the order of suffixes that start with the same 2^k characters cannot be determined in the kth round. To address this, we propose an efficient suffix sorting algorithm called dsufsort that overcomes this drawback of qsufsort. In particular, our proposal maintains the depth of each unsorted portion of the SA and sorts the suffixes based on that depth in each round. In this way, some suffixes that qsufsort cannot sort in a given round can now be sorted. As a result, later rounds can use more of the sorting results from the current round, and the total number of sorting rounds is reduced, which makes dsufsort more efficient than qsufsort. The experimental results show the effectiveness of the proposed algorithm, especially for highly repetitive texts.
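
For orientation, the round structure described in the abstract can be illustrated with a minimal prefix-doubling sketch in Python. This is a simplified textbook scheme in the same family as qsufsort, not the qsufsort implementation and not the dsufsort algorithm proposed in the paper; the function name `suffix_array_doubling` and all variable names are assumptions made for illustration. At the start of each round the suffixes are ranked by their first h characters, and the round refines that order to the first 2h characters.

```python
def suffix_array_doubling(text):
    """Build a suffix array by prefix doubling.

    Illustrative sketch only: a simplified relative of qsufsort,
    NOT the dsufsort algorithm described in the abstract.
    """
    n = len(text)
    if n == 0:
        return []
    sa = list(range(n))
    # Initial ranks: order suffixes by their first character only.
    rank = [ord(c) for c in text]
    h = 1  # rank[] currently reflects the first h characters
    while True:
        # Compare suffixes by (rank of first h chars, rank of next h chars).
        # A suffix shorter than h + 1 characters gets -1 so it sorts first.
        def key(i):
            return (rank[i], rank[i + h] if i + h < n else -1)

        sa.sort(key=key)

        # Re-rank: suffixes with equal keys stay tied in the same group.
        new_rank = [0] * n
        for j in range(1, n):
            prev, cur = sa[j - 1], sa[j]
            new_rank[cur] = new_rank[prev] + (1 if key(cur) != key(prev) else 0)
        rank = new_rank

        # All ranks distinct means every suffix is fully ordered.
        if rank[sa[-1]] == n - 1:
            break
        h *= 2
    return sa


print(suffix_array_doubling("banana"))  # [5, 3, 1, 0, 4, 2]
```

In this sketch, groups of suffixes that share their first 2h characters remain tied after a round and must wait for a later round, which is the kind of limitation the abstract attributes to qsufsort. Highly repetitive inputs such as a long run of a single character need the most doubling rounds under this scheme.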
Pages: 16