Whois? Deep Author Name Disambiguation Using Bibliographic Data

Cited by: 7
Authors
Boukhers, Zeyd [1, 2]
Asundi, Nagaraj Bahubali [1]
Affiliations
[1] Univ Koblenz Landau, Inst Web Sci & Technol WeST, Koblenz, Germany
[2] Fraunhofer Inst Appl Informat Technol, St Augustin, Germany
Keywords
Author name disambiguation; Entity linkage; Bibliographic data; Neural networks; Classification
DOI
10.1007/978-3-031-16802-4_16
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As the number of authors increases exponentially over the years, the number of authors sharing the same name increases proportionally. This makes it challenging to assign newly published papers to their correct authors. Therefore, Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries. This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use a collection from the DBLP repository that contains more than 5 million bibliographic records authored by around 2.6 million co-authors. Our approach first groups authors who share the same last name and the same first-name initial. The author within each group is then identified by capturing the relation with his/her co-authors and area of research, where the area of research is represented by the titles of the validated publications of the corresponding author. For this purpose, we train a neural network model that learns from the representations of the co-authors and titles. We validated the effectiveness of our approach by conducting extensive experiments on a large dataset.
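The grouping step described in the abstract (same last name, same first-name initial) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `block_key` and `group_by_block`, the sample names, and the assumption that every name is written as "Given ... Family" are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def block_key(full_name: str) -> tuple[str, str]:
    """Return (last name, first-name initial), both lowercased.

    Assumption: the name is written "Given ... Family" and is not empty.
    """
    parts = full_name.strip().split()
    if not parts:
        raise ValueError("empty author name")
    return parts[-1].lower(), parts[0][0].lower()


def group_by_block(names: list[str]) -> dict[tuple[str, str], list[str]]:
    """Group name strings that share the same last name and first initial."""
    blocks = defaultdict(list)
    for name in names:
        blocks[block_key(name)].append(name)
    return dict(blocks)


# Hypothetical sample names, not taken from DBLP.
names = ["Jane Smith", "John Smith", "J. Smith", "Alice Smith", "Wei Zhang"]
blocks = group_by_block(names)
```

Here the three "J" Smith variants fall into one group, which is where a downstream model (in the paper, a neural network over co-author and title representations) would have to decide which real-world author each record belongs to.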
Pages: 201-215
Number of pages: 15