Author name disambiguation using a graph model with node splitting and merging based on bibliographic information

Cited by: 2
Authors
Dongwook Shin
Taehwan Kim
Joongmin Choi
Jungsun Kim
Affiliation
[1] Hanyang University, Department of Computer Science and Engineering
Source
Scientometrics | 2014 / Vol. 100
Keywords
Author name disambiguation; Graph model; Namesake resolution; Heteronymous name resolution; Digital library;
DOI
Not available
Abstract
Author ambiguity arises mainly in two ways: when several different authors write their names in the same way, known as the namesake problem, and when one author's name is written in many different ways, referred to as the heteronymous name problem. These problems have long been an obstacle to efficient information retrieval in digital libraries, causing authors to be identified incorrectly and their publications to be misclassified. Distinguishing such authors is a nontrivial task, especially when very little information about them is available. In this paper, we propose a graph-based approach to author name disambiguation, in which a graph model is constructed from co-author relations and author ambiguity is resolved by graph operations, namely vertex (or node) splitting and merging based on co-authorship. In our framework, called a Graph Framework for Author Disambiguation (GFAD), the namesake problem is solved by splitting an author vertex involved in multiple cycles of co-authorship, and the heteronymous name problem is handled by merging multiple author vertices that have similar names if those vertices are connected to a common vertex. Experiments were carried out on the real DBLP and Arnetminer collections, and the performance of GFAD was compared with that of three representative unsupervised author name disambiguation systems. GFAD shows better overall performance on representative evaluation metrics. As an additional contribution, we have released the refined DBLP collection to the public to facilitate a performance benchmark for future author disambiguation systems.
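The two graph operations named in the abstract can be illustrated with a small Python sketch. It is not the GFAD implementation: the abstract does not give the cycle-detection procedure or the name-similarity measure, so the sketch substitutes simpler stand-ins. Merge candidates are similar-named vertices that share a co-author, and split candidates are found by grouping a vertex's co-authors into clusters that are linked to one another directly. All function names, the `difflib` similarity test, the 0.8 threshold, and the sample names are assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def build_graph(papers):
    """Build an undirected co-author graph: name -> set of co-author names."""
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            graph[a]  # make sure single-author papers still create a vertex
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def similar(a, b, threshold=0.8):
    """Assumed name-similarity test (not from the paper)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_candidates(graph, threshold=0.8):
    """Pairs of similar-named vertices connected to a common vertex."""
    pairs = []
    for a, b in combinations(sorted(graph), 2):
        if similar(a, b, threshold) and graph[a] & graph[b]:
            pairs.append((a, b))
    return pairs


def split_groups(graph, name):
    """Cluster the co-authors of `name` by their direct links to each other.

    More than one cluster hints that `name` may cover several people.
    This is a simplification of the cycle-based splitting in the abstract.
    """
    neighbours = graph[name]
    unseen = set(neighbours)
    groups = []
    while unseen:
        stack = [unseen.pop()]
        group = set(stack)
        while stack:
            cur = stack.pop()
            for nxt in (graph[cur] & neighbours) - group:
                group.add(nxt)
                stack.append(nxt)
        unseen -= group
        groups.append(group)
    return groups


# Hypothetical author lists, one per paper.
papers = [
    ["Wei Wang", "A. Lee", "B. Chen"],
    ["Wei Wang", "A. Lee"],
    ["Wei Wang", "C. Park", "D. Sato"],
    ["Jon Smith", "A. Lee"],
    ["John Smith", "A. Lee"],
]
graph = build_graph(papers)
print(merge_candidates(graph))
print(split_groups(graph, "Wei Wang"))
```

On this sample, "Jon Smith" and "John Smith" are flagged for merging because they share the co-author "A. Lee", and the co-authors of "Wei Wang" fall into two unconnected clusters, which suggests the name may cover two people. A real system would need stronger evidence than either heuristic alone.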
Pages: 15-50
Number of pages: 35
Related papers
50 in total
  • [41] A novel approach for author name disambiguation using ranking confidence
    Lin, Xueqin
    Zhu, Jia
    Tang, Yong
    Yang, Fen
    Peng, Bo
    Li, Weiling
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, 10179 LNCS : 169 - 182
  • [42] Author Name Disambiguation for Citations Using Topic and Web Correlation
    Yang, Kai-Hsiang
    Peng, Hsin-Tsung
    Jiang, Jian-Yi
    Lee, Hahn-Ming
    Ho, Jan-Ming
    RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES, 2008, 5173 : 185 - +
  • [43] A hybrid knowledge-based framework for author name disambiguation
    Protasiewicz, Jaroslaw
    Dadas, Slawomir
    2016 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2016, : 594 - 600
  • [44] A Network Maximum Flow Based Approach for Author Name Disambiguation
    Quan J.
    Fu L.
    Gan X.
    Wang X.
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2020, 54 (02): : 111 - 116
  • [45] The Microsoft Academic Knowledge Graph enhanced: Author name disambiguation, publication classification, and embeddings
    Farber, Michael
    Ao, Lin
    QUANTITATIVE SCIENCE STUDIES, 2022, 3 (01): : 51 - 98
  • [46] Incremental author name disambiguation using author profile models and self-citations
    Hussain, Ijaz
    Asghar, Sohail
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2019, 27 (05) : 3665 - 3681
  • [47] A probabilistic similarity metric for Medline records: A model for author name disambiguation
    Torvik, VI
    Weeber, M
    Swanson, DR
    Smalheiser, NR
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2005, 56 (02): : 140 - 158
  • [48] An Unsupervised Heuristic-Based Hierarchical Method for Name Disambiguation in Bibliographic Citations
    Cota, Ricardo G.
    Ferreira, Anderson A.
    Nascimento, Cristiano
    Goncalves, Marcos Andre
    Laender, Alberto H. F.
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2010, 61 (09): : 1853 - 1870
  • [49] Name Disambiguation Based on Similar Features and Relation Graph Optimization
    Cui H.
    Yang J.
    Song W.
    Data Analysis and Knowledge Discovery, 2023, 7 (05) : 71 - 80
  • [50] Using node merging to enhance graph coloring
    Vegdahl, SR
    ACM SIGPLAN NOTICES, 1999, 34 (05) : 150 - 154