GTED: Graph Traversal Edit Distance

Cited by: 1
Authors
Boroojeny, Ali Ebrahimpour [1]
Shrestha, Akash [1]
Sharifi-Zarchi, Ali [1, 2, 3]
Gallagher, Suzanne Renick [1]
Sahinalp, S. Cenk [4]
Chitsaz, Hamidreza [1]
Affiliations
[1] Colorado State Univ, Ft Collins, CO 80523 USA
[2] Royan Inst, Tehran, Iran
[3] Sharif Univ Technol, Tehran, Iran
[4] Indiana Univ, Bloomington, IN USA
Keywords
STRUCTURAL VARIATION; GENOME
DOI
10.1007/978-3-319-89929-9_3
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Many problems in applied machine learning deal with graphs (also called networks), including social networks, security, web data mining, protein function prediction, and genome informatics. The kernel paradigm beautifully decouples the learning algorithm from the underlying geometric space, which renders graph kernels important for the aforementioned applications. In this paper, we give a new graph kernel which we call graph traversal edit distance (GTED). We introduce the GTED problem and give the first polynomial time algorithm for it. Informally, the graph traversal edit distance is the minimum edit distance between two strings formed by the edge labels of respective Eulerian traversals of the two graphs. Also, GTED is motivated by and provides the first mathematical formalism for sequence co-assembly and de novo variation detection in bioinformatics. We demonstrate that GTED admits a polynomial time algorithm using a linear program in the graph product space that is guaranteed to yield an integer solution. To the best of our knowledge, this is the first approach to this problem. We also give a linear programming relaxation algorithm for a lower bound on GTED. We use GTED as a graph kernel and evaluate it by computing the accuracy of an SVM classifier on a few datasets in the literature. Our results suggest that our kernel outperforms many of the common graph kernels in the tested datasets. As a second set of experiments, we successfully cluster viral genomes using GTED on their assembly graphs obtained from de novo assembly of next generation sequencing reads. Our GTED implementation can be downloaded from http://chitsazlab.org/software/gted/.
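The informal definition in the abstract (the minimum edit distance between edge-label strings of Eulerian traversals of two graphs) can be illustrated on toy inputs. The following is a minimal brute-force sketch, not the polynomial-time linear-programming algorithm described in the paper. It assumes directed multigraphs given as lists of `(source, target, label)` triples, unit-cost Levenshtein distance, and that a traversal is any trail (open or closed) using every edge exactly once. The function names are hypothetical and chosen only for this illustration, and the enumeration is exponential in the number of edges.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


def eulerian_trail_labels(edges):
    """Yield the label string of every Eulerian trail (assumption: open or closed).

    edges: list of (source, target, label) triples of a directed multigraph.
    """
    n = len(edges)
    out = {}
    for idx, (u, _v, _label) in enumerate(edges):
        out.setdefault(u, []).append(idx)
    used = [False] * n

    def extend(node, labels):
        if len(labels) == n:          # every edge has been used exactly once
            yield "".join(labels)
            return
        for idx in out.get(node, []):
            if not used[idx]:
                used[idx] = True
                labels.append(edges[idx][2])
                yield from extend(edges[idx][1], labels)
                labels.pop()
                used[idx] = False

    for start in {u for u, _v, _label in edges}:
        yield from extend(start, [])


def gted_brute_force(edges1, edges2):
    """Minimum edit distance over all pairs of Eulerian-trail label strings."""
    words1 = set(eulerian_trail_labels(edges1))
    words2 = set(eulerian_trail_labels(edges2))
    if not words1 or not words2:
        raise ValueError("each graph must admit at least one Eulerian trail")
    return min(edit_distance(a, b) for a in words1 for b in words2)


# Two labeled 3-cycles that differ in a single edge label.
g1 = [(0, 1, "A"), (1, 2, "C"), (2, 0, "G")]
g2 = [(0, 1, "A"), (1, 2, "C"), (2, 0, "T")]
print(gted_brute_force(g1, g2))
```

Because every Eulerian trail of each graph is enumerated, this sketch is only practical for graphs with a handful of edges; it is meant to make the definition concrete rather than to stand in for the authors' implementation.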
Pages: 37-53
Page count: 17