Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data

Cited by: 1
Authors
Sobhgol, Sepideh Sadat [1]
Durand, Gabriel Campero [2]
Rauchhaupt, Lutz [1]
Saake, Gunter [2]
Affiliations
[1] ifak e.V. Magdeburg, Magdeburg, Germany
[2] Otto von Guericke Univ, Magdeburg, Germany
Keywords
Graph Analysis; Network Analysis; Link Prediction; Supervised Learning
DOI
10.5220/0010381901590166
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
When combining data management and ML tools, a common problem is that ML frameworks may require moving the data out of its traditional storage (i.e., databases) for model building. In such scenarios, it could be more effective to adopt in-database statistical functionality (Cohen et al., 2009). Such functionality has received attention for relational databases, but for graph database systems there are insufficient studies to guide users, either by clarifying the role of the database or by identifying the pain points that require attention. In this paper we make an early feasibility assessment of such processing for the graph domain, prototyping an in-database, ML-driven case study on link prediction on a state-of-the-art graph database (Neo4j). We identify a general series of steps and a common-sense approach for database support. We find limited differences across the processing setups in most steps, suggesting a need for further evaluation. We identify bulk feature calculation as the most time-consuming task, at both the model building and inference stages, and hence define it as a focus area for improving how graph databases support ML workloads.
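To make the "bulk feature calculation" step concrete, the following is a minimal sketch of computing topological link-prediction features for many candidate node pairs at once. It is an illustration only: the abstract does not say which features the authors used, so the three shown here (common neighbors, Jaccard coefficient, preferential attachment) are assumed as typical choices, and the function names and toy graph are hypothetical. The sketch runs in plain Python outside any database and does not reproduce the paper's Neo4j setup.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Compute assumed, commonly used topological features for one node pair."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(nu) * len(nv),
    }


def bulk_features(adj, candidate_pairs):
    """Compute features for every candidate pair (the bulk step)."""
    return {(u, v): pair_features(adj, u, v) for u, v in candidate_pairs}


# Hypothetical toy co-authorship graph: nodes are authors, edges are collaborations.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = build_adjacency(edges)

# Candidate links: all currently unconnected node pairs.
candidates = [
    (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
]
features = bulk_features(adj, candidates)
```

The resulting feature dictionaries could then be fed, together with positive and negative link labels, to any supervised classifier. The number of candidate pairs grows quadratically with the number of nodes, which is one plausible reason a step of this kind dominates runtime.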
Pages: 159-166
Page count: 8
Related papers
50 in total
  • [1] Link Prediction of Weighted Triples for Knowledge Graph Completion Within the Scholarly Domain
    Nayyeri, Mojtaba
    Cil, Goekce Muege
    Vahdati, Sahar
    Osborne, Francesco
    Kravchenko, Andrey
    Angioni, Simone
    Salatino, Angelo
    Recupero, Diego Reforgiato
    Motta, Enrico
    Lehmann, Jens
    IEEE ACCESS, 2021, 9 : 116002 - 116014
  • [2] Link prediction of the knowledge graph in the CTD database
    Jeon, J.
    Woo, G.
    Kim, K.
    Cho, S.
    Shin, W.
    Kim, D.
    Choi, J.
    TOXICOLOGY LETTERS, 2024, 399 : S140 - S140
  • [3] Recommendation as Link Prediction: A Graph Kernel-based Machine Learning Approach
    Li, Xin
    Chen, Hsinchun
    JCDL 09: PROCEEDINGS OF THE 2009 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, 2009, : 213 - 216
  • [4] Link Prediction Based on Data Augmentation and Metric Learning Knowledge Graph Embedding
    Duan, Lijuan
    Han, Shengwen
    Jiang, Wei
    He, Meng
    Qiao, Yuanhua
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [5] Line graph contrastive learning for link prediction
    Zhang, Zehua
    Sun, Shilin
    Ma, Guixiang
    Zhong, Caiming
    PATTERN RECOGNITION, 2023, 140
  • [6] Recommendation as link prediction in bipartite graphs: A graph kernel-based machine learning approach
    Li, Xin
    Chen, Hsinchun
    DECISION SUPPORT SYSTEMS, 2013, 54 (02) : 880 - 890
  • [7] Machine learning for KPIs prediction: a case study of the overall equipment effectiveness within the automotive industry
    Choumicha EL Mazgualdi
    Tawfik Masrour
    Ibtissam El Hassani
    Abdelmoula Khdoudi
    Soft Computing, 2021, 25 : 2891 - 2909
  • [8] Machine learning for KPIs prediction: a case study of the overall equipment effectiveness within the automotive industry
    El Mazgualdi, Choumicha
    Masrour, Tawfik
    El Hassani, Ibtissam
    Khdoudi, Abdelmoula
    SOFT COMPUTING, 2021, 25 (04) : 2891 - 2909
  • [9] Scalable Graph Convolutional Network based Link Prediction on a Distributed Graph Database Server
    Karunarathna, Anuradha
    Senarath, Dinika
    Madhushanki, Shalika
    Weerakkody, Chinthaka
    Dayarathna, Miyuru
    Jayasena, Sanath
    Suzumura, Toyotaro
    2020 IEEE 13TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2020), 2020, : 107 - 115
  • [10] Tarragona Graph Database for Machine Learning Based on Graphs
    Rica, Elena
    Alvarez, Susana
    Serratosa, Francesc
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2022, 2022, 13813 : 302 - 310