Efficient and Effective Academic Expert Finding on Heterogeneous Graphs through (k, P)-Core based Embedding

Authors
Wang, Yuxiang [1]
Liu, Jun [1]
Xu, Xiaoliang [1]
Ke, Xiangyu [2]
Wu, Tianxing [3]
Gou, Xiaoxuan [1]
Affiliations
[1] Hangzhou Dianzi Univ, 2 Ave, Hangzhou 310018, Zhejiang, Peoples R China
[2] Zhejiang Univ, 866 Yuhangtang Rd, Hangzhou 310058, Zhejiang, Peoples R China
[3] Southeast Univ, 2 Southeast Univ Rd, Nanjing 210096, Jiangsu, Peoples R China
Keywords
Expert finding; (k, P)-core community; document/expert embedding; heterogeneous graph; MODELS
DOI
10.1145/3578365
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Expert finding is crucial for a wealth of applications in both academia and industry. Given a user query and a trove of academic papers, expert finding aims at retrieving the most relevant experts for the query from the academic papers. Existing studies focus on embedding-based solutions that consider academic papers' textual semantic similarities to a query via document representation and extract the top-n experts from the most similar papers. Beyond implicit textual semantics, however, papers' explicit relationships (e.g., co-authorship) in a heterogeneous graph (e.g., DBLP) are critical for expert finding, because they help improve the representation quality. Despite their importance, the explicit relationships of papers generally have been ignored in the literature. In this article, we study expert finding on heterogeneous graphs by considering both the explicit relationships and implicit textual semantics of papers in one model. Specifically, we define the cohesive (k, P)-core community of papers w.r.t. a meta-path P (i.e., relationship) and propose a (k, P)-core based document embedding model to enhance the representation quality. Based on this, we design a proximity graph-based index (PG-Index) of papers and present a threshold algorithm (TA)-based method to efficiently extract top-n experts from papers returned by PG-Index. We further optimize our approach in two ways: (1) we boost effectiveness by considering the (k, P)-core community of experts and the diversity of experts' research interests, to achieve high-quality expert representation from paper representation; and (2) we streamline expert finding, going from "extract top-n experts from top-m (m > n) semantically similar papers" to "directly return top-n experts". The process of returning a large number of top-m papers as intermediate data is avoided, thereby improving the efficiency. Extensive experiments using real-world datasets demonstrate our approach's superiority.
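The abstract does not spell out how a (k, P)-core community is computed. The following is a minimal sketch, not the authors' algorithm, assuming the meta-path P is Paper-Author-Paper (two papers are P-neighbors when they share an author) and that a (k, P)-core is the maximal set of papers in which every paper has at least k P-neighbors inside the set. The function names `p_neighbors` and `kp_core` and the toy data are hypothetical.

```python
from collections import defaultdict


def p_neighbors(paper_authors):
    """Link papers via the assumed meta-path Paper-Author-Paper."""
    author_papers = defaultdict(set)
    for paper, authors in paper_authors.items():
        for author in authors:
            author_papers[author].add(paper)
    nbrs = {paper: set() for paper in paper_authors}
    for papers in author_papers.values():
        for paper in papers:
            nbrs[paper] |= papers - {paper}
    return nbrs


def kp_core(paper_authors, k):
    """Peel papers with fewer than k P-neighbors until none remain."""
    nbrs = p_neighbors(paper_authors)
    alive = set(nbrs)
    queue = [p for p in alive if len(nbrs[p]) < k]
    while queue:
        p = queue.pop()
        if p not in alive:
            continue  # already removed via an earlier queue entry
        alive.discard(p)
        for q in nbrs[p]:
            if q in alive:
                nbrs[q].discard(p)
                if len(nbrs[q]) < k:
                    queue.append(q)
    return alive


# Toy data (hypothetical): paper id -> set of author ids.
example = {
    "p1": {"a", "b"},
    "p2": {"a", "c"},
    "p3": {"b", "c", "d"},
    "p4": {"d"},
    "p5": {"e"},
}
core = kp_core(example, 2)
```

In this toy graph, p1, p2, and p3 pairwise share an author, p4 is linked only to p3, and p5 is isolated, so peeling with k = 2 drops p4 and p5. The paper's definition may differ in detail (for example, in how a query paper anchors the community), so treat this only as an illustration of the peeling idea.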
Pages: 35