A compression method of double-array structures using linear functions

Cited by: 7
Authors
Kanda, Shunsuke [1]
Fuketa, Masao [1]
Morita, Kazuhiro [1]
Aoe, Jun-ichi [1]
Affiliations
[1] Univ Tokushima, Dept Informat Sci & Intelligent Syst, Minamijosanjima 2-1, Tokushima 7708506, Japan
Keywords
Trie; Double-array; Compression method; Information retrieval; ALGORITHM;
DOI
10.1007/s10115-015-0873-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A trie is a data structure for keyword search and is used in natural language processing, reserved-word lookup in compilers, and other applications. The double-array and LOUDS are efficient representations of the trie. The double-array provides fast traversal in O(1) time, but it uses more space than LOUDS. LOUDS is a succinct data structure based on a bit string, and its space usage is extremely compact; however, its traversal is comparatively slow. This paper presents a new compression method for the double-array that preserves retrieval speed. The method compresses the double-array by dividing it into blocks and using linear functions. Experimental results on varied keyword sets show that the new method reduced the space usage of the double-array up to about 44 %, and that its retrieval speed was 9-14 times faster than that of LOUDS. Moreover, the results show that, for a large keyword set, construction with the new method was faster than with the conventional method.
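The O(1) traversal mentioned in the abstract can be illustrated with the standard BASE/CHECK formulation of the double-array: a transition from node `s` on a character with code `c` goes to `t = BASE[s] + c` and is valid only if `CHECK[t] == s`. The following is a minimal sketch of that uncompressed structure, not the block-wise linear-function compression proposed in the paper. The names `build_double_array`, `contains`, `code`, and `TERMINATOR`, the first-fit placement, and the restriction to lowercase `a`-`z` keys are all assumptions made for illustration.

```python
from collections import deque

TERMINATOR = "#"  # hypothetical end-of-key marker, assumed absent from keys


def code(ch):
    """Map a character to a positive code (assumes lowercase a-z keys)."""
    return 1 if ch == TERMINATOR else ord(ch) - ord("a") + 2


def build_double_array(keys):
    """Build BASE/CHECK arrays with a naive first-fit search (illustrative)."""
    # Step 1: build an ordinary dictionary-based trie.
    root = {}
    for key in keys:
        node = root
        for ch in key + TERMINATOR:
            node = node.setdefault(ch, {})

    # Step 2: place nodes breadth-first. CHECK[t] == -1 marks a free slot.
    base, check = [0], [0]  # slot 0 is the root, marked as used
    queue = deque([(0, root)])
    while queue:
        s, node = queue.popleft()
        if not node:  # leaf reached after the terminator; nothing to place
            continue
        codes = [code(ch) for ch in node]
        b = 1
        while True:
            # Grow the arrays so every candidate slot b + c exists.
            while len(base) <= b + max(codes):
                base.append(0)
                check.append(-1)
            if all(check[b + c] == -1 for c in codes):
                break
            b += 1
        base[s] = b
        for ch, child in node.items():
            t = b + code(ch)
            check[t] = s  # record the parent so lookups can validate
            queue.append((t, child))
    return base, check


def contains(base, check, key):
    """Exact-match lookup: one array read and one comparison per character."""
    s = 0
    for ch in key + TERMINATOR:
        t = base[s] + code(ch)
        if not (0 <= t < len(check)) or check[t] != s:
            return False
        s = t
    return True


keys = ["trie", "tree", "try", "array"]
base, check = build_double_array(keys)
```

Each step of `contains` touches a constant number of array cells regardless of how many children a node has, which is where the constant-time transition comes from. The unused slots left by first-fit placement are one source of the space overhead that compression methods for the double-array aim to reduce.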
Pages: 55 - 80
Number of pages: 26
Related papers
50 in total
  • [21] Neural response telemetry in patients with the double-array cochlear implant
    Maria Valéria Goffi-Gomez
    Carolina F. Abdala
    Cristina Gomes Ornelas Peralta
    Robinson Koji Tsuji
    Rubens Vuono de Brito Neto
    Ricardo Ferreira Bento
    European Archives of Oto-Rhino-Laryngology, 2010, 267 : 515 - 522
  • [22] Research of Chinese word segmentation based on Double-Array Trie
    School of Computer and Communication, Hunan Univ., Changsha 410082, China
    Hunan Daxue Xuebao, 2009, 5 (77-80):
  • [23] Neural response telemetry in patients with the double-array cochlear implant
    Goffi-Gomez, Maria Valaria
    Abdala, Carolina F.
    Ornelas Peralta, Cristina Gomes
    Tsuji, Robinson Koji
    de Brito Neto, Rubens Vuono
    Bento, Ricardo Ferreira
    EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY, 2010, 267 (04) : 515 - 522
  • [24] Comparative Study on the Double-Array Structure for Large English & Chinese Lexicons
    Xu, Shuo
    Zhu, Li-Jun
    Qiao, Xiao-Dong
    ICICTA: 2009 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION TECHNOLOGY AND AUTOMATION, VOL IV, PROCEEDINGS, 2009 : 158 - 162
  • [25] Engineering faster double-array Aho-Corasick automata
    Kanda, Shunsuke
    Akabe, Koichi
    Oda, Yusuke
    SOFTWARE-PRACTICE & EXPERIENCE, 2023, 53 (06) : 1332 - 1361
  • [26] A new compression method of double array for compact dictionaries
    Fuketa, M
    Morita, K
    Sumitomo, T
    Kashiji, S
    Atlam, E
    Aoe, JI
    INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2004, 81 (08) : 943 - 953
  • [27] An efficient representation for implementing finite state machines based on the double-array
    Mizobuchi, S
    Sumitomo, T
    Fuketa, M
    Aoe, J
    INFORMATION SCIENCES, 2000, 129 (1-4) : 119 - 139
  • [28] Study for the Double-array Trie Tree Based Algorithm in Word Segmentation
    Yang, Wenchuan
    Fang, Zeyang
    Li, Pengfei
    INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENVIRONMENTAL ENGINEERING (CSEE 2015), 2015 : 440 - 446
  • [29] A construction method by divided double array structures
    Fuketa, Masao
    Kanda, Shunsuke
    International Journal of Intelligent Systems Technologies and Applications, 2015, 14 (3-4) : 273 - 283
  • [30] Compressed double-array tries for string dictionaries supporting fast lookup
    Shunsuke Kanda
    Kazuhiro Morita
    Masao Fuketa
    Knowledge and Information Systems, 2017, 51 : 1023 - 1042