Efficient POSIX submatch extraction on nondeterministic finite automata

Cited by: 3
Authors
Borsotti, Angelo [1]
Trofimovich, Ulya [2]
Affiliations
[1] Polytech Univ Milan, Dept Elect Informat & Bioengn, Milan, Italy
[2] Belarusian State Univ, Dept Discrete Math & Algorithm, Minsk, Belarus
Source
SOFTWARE-PRACTICE & EXPERIENCE | 2021 / Volume 51 / Issue 02
Keywords
finite-state automata; parsing; POSIX; regular expressions; submatch extraction
DOI
10.1002/spe.2881
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper we study the performance of POSIX submatch extraction algorithms based on nondeterministic finite automata (NFA). We propose an algorithm that combines Laurikari tagged NFA and extended Okui-Suzuki disambiguation. The algorithm works in worst-case $O(n m^2 t)$ time and $O(m^2)$ space (including preprocessing), where $n$ is the length of input, $m$ is the size of the regular expression with bounded repetition expanded and $t$ is the number of capturing groups and subexpressions that contain them. On real-world benchmarks our algorithm performs close to the $O(n m t)$ complexity of leftmost-greedy matching, although on artificial benchmarks it can be significantly slower. We propose a lazy version of the algorithm that runs much faster, but requires $O(n m^2)$ space. We show that the Kuklewicz algorithm is slower in practice, and the backward matching algorithm proposed by Cox is incorrect.
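The abstract contrasts POSIX submatch extraction with leftmost-greedy matching. The following minimal sketch illustrates only how the two disambiguation policies can assign different submatches to the same input; it is not the algorithm proposed in the paper. Python's `re` module uses leftmost-greedy (Perl-style) semantics. The helper `posix_groups` is hypothetical: it brute-forces a single hard-coded pattern and assumes that, for a plain concatenation of groups, the POSIX choice can be modeled by preferring longer matches for earlier groups.

```python
import re
from itertools import product

PATTERN = r"(a|ab)(c|bcd)(d*)"
TEXT = "abcd"

# Python's `re` is a backtracking matcher: the first alternative that
# leads to a successful match wins (leftmost-greedy semantics).
greedy_groups = re.match(PATTERN, TEXT).groups()


def posix_groups(text):
    """Hypothetical brute force for the fixed pattern above only.

    Enumerates every way the three groups can cover the whole text,
    then prefers longer matches for earlier groups, which is how the
    POSIX rule is modeled here for a simple concatenation.
    """
    candidates = []
    for g1, g2 in product(("a", "ab"), ("c", "bcd")):
        prefix = g1 + g2
        if not text.startswith(prefix):
            continue
        g3 = text[len(prefix):]
        if set(g3) <= {"d"}:  # the remainder must match d*
            candidates.append((g1, g2, g3))
    # All candidates span the whole text, so only group lengths differ.
    return max(candidates, key=lambda groups: tuple(len(g) for g in groups))


print("leftmost-greedy:", greedy_groups)
print("POSIX-style:    ", posix_groups(TEXT))
```

Both policies match the whole string `abcd`, but they split it differently between the capturing groups, which is why submatch extraction needs its own disambiguation step on top of plain matching.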
Pages: 159-192
Page count: 34