Efficient dictionary matching by Aho-Corasick automata of truncated patterns

Authors
Zhang, Meng [1]
Fan, Jiashu [1]
Chen, Dequan [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
algorithm; dictionary matching; Aho-Corasick automaton;
DOI
10.1504/IJCSM.2016.078738
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
We present a space-efficient data structure for dictionary matching. Each pattern is rewritten as a truncated pattern, a string whose symbols are l-length substrings of the original pattern. By combining the AC automaton of the truncated patterns with that of the l-length substrings, we simulate the AC automaton of the original pattern set. The new structure is space-economical because prefix merging is applied to the substrings of the patterns. With this structure, dictionary matching runs in O(n log k + tocc log k + occ) time, where n is the length of the text, k is the number of patterns, occ is the number of occurrences of patterns in the text, and tocc is the number of occurrences of the strings that are, for each pattern, its longest prefix whose length is a multiple of l.
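For orientation, the classical Aho-Corasick automaton that the abstract says is being simulated can be sketched as below. This is a minimal textbook version, not the truncated-pattern structure of the paper; the function names `build_automaton` and `find_all` and the dictionary-based transition tables are assumptions made for illustration only.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables of a classical AC automaton."""
    goto = [{}]   # goto[state][symbol] -> next state (state 0 is the root)
    fail = [0]    # failure link of each state
    out = [[]]    # indices of the patterns recognised at each state

    # Phase 1: insert every pattern into a trie.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Phase 2: breadth-first traversal to set failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the patterns that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_all(text, patterns):
    """Return (start position, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((pos - len(patterns[idx]) + 1, patterns[idx]))
    return hits
```

Calling `find_all("ushers", ["he", "she", "his", "hers"])` reports "she" at position 1 and both "he" and "hers" at position 2. The sketch stores one trie node per pattern prefix, which is the space cost that the truncated-pattern structure described in the abstract aims to reduce.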
Pages: 323-329
Number of pages: 7