Subtree Similarity Search Based on Structure and Text

Cited by: 0
Authors
Mizokami, Takuya [1]
Bou, Savong [2]
Amagasa, Toshiyuki [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Sci & Technol, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan
Keywords
Approximate matching; Similarity search; Tree edit distance; Algorithms; Efficient; Framework; Robust
DOI
10.1007/978-3-031-68323-7_6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Given a query tree, the subtree similarity search problem is to find all subtrees in a document tree that are similar to the query tree. The previous scan-based method extracts candidate subtrees based on size difference, which captures only structural differences and ignores differences in the contents the trees represent. As a result, it suffers from two issues. First, for queries against a tree with a regular structure, subtrees are hard to differentiate by structural similarity alone, yielding a large number of candidates to verify. Second, candidates are verified by computing the tree edit distance, whose cost is cubic in the number of tree nodes. In this paper, we propose a solution to the subtree similarity search problem based on both the structure and the contents of the trees. We demonstrate through experiments that the proposed method outperforms the previous scan-based methods in terms of speed and is competitive with index-based methods.
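The candidate-explosion issue on regularly structured trees can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes unit-cost edit operations and compares a size-difference filter against a label-multiset lower bound, a generic content-aware filter chosen here only for illustration. The names `Node`, `profile`, `filter_candidates`, `rec`, and `demo` are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list[Node] = field(default_factory=list)


def profile(node: Node, out: list) -> tuple[int, Counter]:
    """Post-order pass: append (node, size, label bag) for every subtree."""
    size, bag = 1, Counter([node.label])
    for child in node.children:
        child_size, child_bag = profile(child, out)
        size += child_size
        bag += child_bag
    out.append((node, size, bag))
    return size, bag


def filter_candidates(doc: Node, query: Node, tau: int):
    """Return (size_pass, label_pass): subtree roots surviving each filter.

    Both filters are lower bounds on the unit-cost tree edit distance
    (an assumption of this sketch), so neither discards a true result.
    """
    q_size, q_bag = profile(query, [])
    subtrees = []
    profile(doc, subtrees)
    size_pass, label_pass = [], []
    for node, size, bag in subtrees:
        # Structure-only bound: the size difference.
        if abs(size - q_size) > tau:
            continue
        size_pass.append(node)
        # Content-aware bound: larger size minus the number of shared labels.
        shared = sum((bag & q_bag).values())
        if max(size, q_size) - shared <= tau:
            label_pass.append(node)
    return size_pass, label_pass


def rec(name: str, year: str) -> Node:
    """Build one record subtree; every record has the same shape."""
    return Node("rec", [Node("name", [Node(name)]), Node("year", [Node(year)])])


def demo() -> tuple[int, int]:
    doc = Node("db", [rec("Alice", "2020"), rec("Bob", "2021"), rec("Alice", "2021")])
    query = rec("Alice", "2020")
    size_pass, label_pass = filter_candidates(doc, query, tau=1)
    return len(size_pass), len(label_pass)
```

In this toy document every record has the same shape, so the size filter keeps all three records, while the label bound discards the record that shares only structural labels with the query. Any surviving candidate would still need verification with an exact tree edit distance computation, which the sketch omits.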
Pages: 72-87
Number of pages: 16
Related papers
50 in total
  • [21] Pharmacophore Alignment Search Tool: Influence of the Third Dimension on Text-Based Similarity Searching
    Haehnke, Volker
    Klenner, Alexander
    Rippmann, Friedrich
    Schneider, Gisbert
    JOURNAL OF COMPUTATIONAL CHEMISTRY, 2011, 32 (08) : 1618 - 1634
  • [22] Text clustering based on asymmetric similarity
    School of Software, Tsinghua University, Beijing 100084, China
    Qinghua Daxue Xuebao, 2006, 7 (1325-1328):
  • [23] Statutes Recommendation Based on Text Similarity
    Zeng, Jin
    Ge, Jidong
    Zhou, Yemao
    Feng, Yi
    Li, Chuanyi
    Li, Zhongjin
    Luo, Bin
    2017 14TH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE (WISA 2017), 2017, : 201 - 204
  • [24] Text Similarity Based on Semantic Analysis
    Wang, Junli
    Zhou, Qing
    Sun, Guobao
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INDUSTRIAL ENGINEERING (AIIE 2016), 2016, 133 : 303 - 307
  • [25] Semantic Based Text Similarity Computation
    Liu, Yaqi
    Li, Zhijiang
    ADVANCED GRAPHIC COMMUNICATIONS AND MEDIA TECHNOLOGIES, 2017, 417 : 343 - 348
  • [26] Software Text Semantic Search Approach Based on Code Structure Knowledge
    Lin Z.-Q.
    Zou Y.-Z.
    Zhao J.-F.
    Cao Y.-K.
    Xie B.
    Ruan Jian Xue Bao/Journal of Software, 2019, 30 (12) : 3714 - 3729
  • [28] On the Subtree Size Profile of Binary Search trees
    Dennert, Florian
    Gruebel, Rudolf
    COMBINATORICS PROBABILITY & COMPUTING, 2010, 19 (04) : 561 - 578
  • [29] Proof of structure similarity as a function of spectral similarity in IR spectral search
    B. G. Derendyaev
    V. N. Piottukh-Peletskii
    Journal of Structural Chemistry, 1999, 40 : 165 - 166
  • [30] Proof of structure similarity as a function of spectral similarity in IR spectral search
    Derendyaev, BG
    Piottukh-Peletskii, VN
    JOURNAL OF STRUCTURAL CHEMISTRY, 1999, 40 (01) : 165 - 166