Exact And Approximate Pattern Matching In The Streaming Model

Cited by: 39
Authors
Porat, Benny [1]
Porat, Ely [1]
Affiliation
[1] Bar Ilan Univ, IL-52100 Ramat Gan, Israel
Keywords
K-MISMATCHES; ALGORITHMS;
DOI
10.1109/FOCS.2009.11
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We present a fully online randomized algorithm for the classical pattern matching problem that uses merely O(log m) space, breaking the O(m) barrier that held for this problem for a long time. Our method can be used as a tool in many practical applications, including monitoring Internet traffic and firewall applications. In our online model we first receive the pattern P of size m and preprocess it. After the preprocessing phase, the characters of the text T of size n arrive one at a time in an online fashion. For each index of the text input we indicate whether or not the pattern matches the text at that index. Clearly, for index i, an indication can only be given once all characters from index i through index i + m - 1 have arrived. Our goal is to provide such answers while using minimal space and spending as little time as possible on each character (with both time and space in O(polylog(n))). We present an algorithm whereby both false positive and false negative answers are allowed, each with probability at most 1/n^3. Thus, by a union bound over the n text positions, the correct answer for all positions is returned with probability at least 1 - 1/n^2. The time our algorithm spends on each input character is bounded by O(log m), and the space complexity is O(log m) words. We also present a solution in the same model for the pattern matching with k mismatches problem. In this problem, a match allows up to k symbol mismatches between the pattern and the subtext beginning at index i. We provide an algorithm in which the time spent on each character is bounded by O(k^2 poly(log m)), and the space complexity is O(k^3 poly(log m)) words.
Pages: 315-323
Page count: 9
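
The online model described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a baseline Karp-Rabin-style fingerprint matcher that keeps the last m characters in a buffer and therefore uses O(m) space, the barrier the abstract says the paper breaks. It only shows the interface of the model: the pattern is preprocessed first, text characters then arrive one at a time, and the answer for start index i is reported as soon as character i + m - 1 arrives. The names `stream_matches`, `fingerprint`, `MOD`, and `BASE` are hypothetical, and the fixed base is a simplifying assumption (a randomized scheme would draw it at random). The sketch assumes m >= 1.

```python
from collections import deque

# Assumed parameters for illustration only (not taken from the paper).
MOD = (1 << 61) - 1   # a Mersenne prime used as the fingerprint modulus
BASE = 256            # hypothetical fixed base; a randomized scheme would pick it at random


def fingerprint(s):
    """Polynomial (Karp-Rabin style) fingerprint of a string."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def stream_matches(pattern, text_stream):
    """Yield (i, matched) for each start index i once character i + m - 1 arrives.

    Baseline only: the window buffer holds m characters, so space is O(m),
    not the O(log m) words claimed for the algorithm in the abstract.
    """
    m = len(pattern)
    target = fingerprint(pattern)   # preprocessing phase on the pattern
    top = pow(BASE, m - 1, MOD)     # weight of the oldest character in the window
    window = deque()
    h = 0
    for j, ch in enumerate(text_stream):
        if len(window) == m:
            # Slide the window: remove the contribution of the oldest character.
            old = window.popleft()
            h = (h - ord(old) * top) % MOD
        window.append(ch)
        h = (h * BASE + ord(ch)) % MOD
        if len(window) == m:
            # Equal fingerprints are reported as a match; a hash collision
            # would produce a false positive.
            yield j - m + 1, h == target
```

For example, `list(stream_matches("aba", "ababa"))` yields `[(0, True), (1, False), (2, True)]`: each answer is emitted only after the last character of the corresponding window has been read.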