Online Pattern Matching for String Edit Distance with Moves

Authors
Takabatake, Yoshimasa [1]
Tabei, Yasuo [2]
Sakamoto, Hiroshi [1]
Affiliations
[1] Kyushu Inst Technol, Kitakyushu, Fukuoka, Japan
[2] Japan Sci & Technol Agcy, PRESTO, Tokyo 1028666, Japan
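DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract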
Edit distance with moves (EDM) is a string-to-string distance measure that allows substring moves in addition to the ordinary editing operations for turning one string into the other. Although computing the optimal EDM is intractable, it has many applications, especially in error detection. Edit sensitive parsing (ESP) is an efficient parsing algorithm that guarantees an upper bound on parsing discrepancies between different occurrences of the same substring in a string. ESP can be used to compute an approximate EDM as the L1 distance between characteristic vectors built from the node labels of parse trees. However, ESP is not applicable to streaming text data, where the whole text is not known in advance. We present an online ESP (OESP) that enables online pattern matching for EDM. OESP builds a parse tree for a streaming text and computes the L1 distance between characteristic vectors in an online manner. For space-efficient computation of EDM, OESP directly encodes the parse tree into a succinct representation by leveraging the idea behind recent results on dynamic succinct trees. We experimentally test OESP's ability to compute EDM in an online manner on benchmark datasets and demonstrate its efficiency.
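The idea of comparing two strings through the L1 distance between characteristic vectors of parse-tree node labels can be illustrated with a minimal sketch. This is not ESP or OESP: it assumes a much simpler parser that pairs adjacent symbols left to right at each level, so it carries none of ESP's guarantees. The names `characteristic_vector`, `l1_distance`, and `rules` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def characteristic_vector(text, rules):
    """Count the node labels of a simple bottom-up pairing tree over `text`.

    Assumption: adjacent symbols are paired left to right at every level.
    This stands in for ESP's parsing and is NOT the ESP algorithm.
    `rules` maps a pair of child labels to a parent label and is shared
    between strings so that equal subtrees receive equal labels.
    """
    counts = Counter(text)          # leaf labels are the characters themselves
    level = list(text)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            pair = (level[i], level[i + 1])
            # Reuse the label of a known pair, otherwise create a fresh one.
            label = rules.setdefault(pair, ("N", len(rules)))
            counts[label] += 1
            nxt.append(label)
        if len(level) % 2 == 1:     # odd length: carry the last symbol upward
            nxt.append(level[-1])
        level = nxt
    return counts


def l1_distance(u, v):
    """L1 distance between two label-count vectors."""
    return sum(abs(u[k] - v[k]) for k in u.keys() | v.keys())


rules = {}                          # shared dictionary of production rules
u = characteristic_vector("abab", rules)
v = characteristic_vector("abba", rules)
print(l1_distance(u, v))            # 4
```

Sharing one `rules` dictionary across both strings is what makes the vectors comparable: identical subtrees map to the same label, so only the differing parts of the two trees contribute to the distance.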
Pages: 203-214
Number of pages: 12
Related papers
50 in total
  • [31] Classes of cost functions for string edit distance
    Rice, SV
    Bunke, H
    Nartker, TA
    ALGORITHMICA, 1997, 18 (02) : 271 - 280
  • [32] Bounded Occurrence Edit Distance: A New Metric for String Similarity Joins with Edit Distance Constraints
    Komatsu, Tomoki
    Okuta, Ryosuke
    Narisawa, Kazuyuki
    Shinohara, Ayumi
    SOFSEM 2014: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2014, 8327 : 363 - 374
  • [33] Efficient Online String Matching Based on Characters Distance Text Sampling
    Faro, Simone
    Marino, Francesco Pio
    Pavone, Arianna
    ALGORITHMICA, 2020, 82 (11) : 3390 - 3412
  • [34] Efficient Online String Matching Based on Characters Distance Text Sampling
    Simone Faro
    Francesco Pio Marino
    Arianna Pavone
    Algorithmica, 2020, 82 : 3390 - 3412
  • [35] Edit distance for a run-length-encoded string and an uncompressed string
    Liu, J. J.
    Huang, G. S.
    Wang, Y. L.
    Lee, R. C. T.
    INFORMATION PROCESSING LETTERS, 2007, 105 (01) : 12 - 16
  • [36] Faster Pattern Matching under Edit Distance A Reduction to Dynamic Puzzle Matching and the Seaweed Monoid of Permutation Matrices
    Charalampopoulos, Panagiotis
    Kociumaka, Tomasz
    Wellnitz, Philip
    2022 IEEE 63RD ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2022, : 698 - 707
  • [37] Matching Patterns with Variables Under Edit Distance
    Gawrychowski, Pawel
    Manea, Florin
    Siemer, Stefan
    STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2022, 2022, 13617 : 275 - 289
  • [38] Efficient relational matching with local edit distance
    Myers, R
    Wilson, RC
    Hancock, ER
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 1711 - 1714
  • [39] Computing the Expected Edit Distance from a String to a PFA
    Calvo-Zaragoza, Jorge
    de la Higuera, Colin
    Oncina, Jose
    Implementation and Application of Automata, 2016, 9705 : 39 - 50
  • [40] An algorithm for string edit distance allowing substring reversals
    Arslan, Abdullah N.
    BIBE 2006: SIXTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2006, : 220 - +