A novel machine-learning approach to measuring scientific knowledge flows using citation context analysis

Cited by: 53
Authors
Hassan, Saeed-Ul [1]
Safder, Iqra [1]
Akram, Anam [1]
Kamiran, Faisal [1]
Affiliations
[1] Information Technology University, 346-B, Ferozepur Road, Lahore 54700, Pakistan
Keywords
Knowledge flows; Machine learning; Citation context classification; Influential citations; Citation analysis; INFORMATION-SCIENCE; PATENT CITATIONS; INSTITUTIONS; SPECIALTY; DIFFUSION; SPACE; US
DOI
10.1007/s11192-018-2767-x
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We measure knowledge flows between countries by analysing publication and citation data, arguing that not all citations are equally important. In contrast to existing techniques that use absolute citation counts to quantify knowledge flows between entities, our model applies citation context analysis, using a machine-learning approach to distinguish important from non-important citations. We use 14 novel features (including context-based, cue-word-based and text-based features) to train a Support Vector Machine (SVM) and a Random Forest classifier on an annotated dataset of 20,527 publications downloaded from the Association for Computational Linguistics anthology (http://allenai.org/data.html). Our machine-learning models outperform existing state-of-the-art citation context approaches, with the SVM model reaching up to 61% and the Random Forest model up to 90% Precision-Recall Area Under the Curve under 10-fold cross-validation. Finally, we present a case study that applies the method to PLoS ONE full-text publications in the field of Computer and Information Sciences. Our results show that a significant volume of the knowledge flow from the United States, based on important citations, is consumed by the international scientific community. Of the total knowledge flow from China, a relatively small proportion (only 4.11%) is based on important citations, while The Netherlands and Germany show the highest proportions, at 9.06% and 7.35% respectively. Among institutions, more than 10% of the knowledge produced at the University of Malaya falls into the important category. We believe such analyses help in understanding the dynamics of knowledge flows across nations and institutions.
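The per-country proportions quoted in the abstract (for example, 4.11% for China) can be read as the share of a country's outgoing knowledge flow that rests on citations classified as important. The following is a minimal sketch of that arithmetic only, not the authors' implementation. It assumes that knowledge flows from the cited country to the citing country and that domestic citations are excluded; the function name `important_flow_share`, the record layout, and the toy records are all hypothetical.

```python
from collections import defaultdict


def important_flow_share(citations):
    """Return {cited_country: (total_flow, important_flow, share_percent)}.

    Each record is (cited_country, citing_country, is_important).
    Assumption of this sketch: knowledge flows from the cited country to
    the citing country, and domestic citations are not counted as flow.
    """
    total = defaultdict(int)
    important = defaultdict(int)
    for cited, citing, is_important in citations:
        if cited == citing:
            continue  # skip domestic citations (assumed, not from the source)
        total[cited] += 1
        if is_important:
            important[cited] += 1
    return {
        country: (total[country], important[country],
                  100.0 * important[country] / total[country])
        for country in total
    }


# Made-up toy records for illustration; these are not data from the paper.
toy = [
    ("US", "DE", True),
    ("US", "CN", False),
    ("US", "NL", False),
    ("US", "US", True),   # domestic, ignored
    ("CN", "US", False),
    ("CN", "DE", False),
    ("NL", "US", True),
    ("NL", "CN", False),
]

shares = important_flow_share(toy)
for country, (n_total, n_important, pct) in sorted(shares.items()):
    print(f"{country}: {n_important}/{n_total} important ({pct:.1f}%)")
```

In the study itself, the `is_important` flag would come from the trained citation-context classifier rather than from hand-written labels.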
Pages: 973-996
Number of pages: 24
Related papers
50 in total
  • [1] A novel machine-learning approach to measuring scientific knowledge flows using citation context analysis
    Saeed-Ul Hassan
    Iqra Safder
    Anam Akram
    Faisal Kamiran
    Scientometrics, 2018, 116 : 973 - 996
  • [2] Measuring Scientific Knowledge Flows by Deploying Citation Context Analysis using Machine Learning Approach on PLoS ONE Full Text
    Saeed-Ul Hassan
    Akram, Anam
    Asghar, Awais
    Aljohani, Naif Radi
    16TH INTERNATIONAL CONFERENCE ON SCIENTOMETRICS & INFORMETRICS (ISSI 2017), 2017, : 322 - 333
  • [3] A machine-learning based approach to measuring constructs through text analysis
    Tsao, Hsiu-Yuan
    Campbell, Colin L.
    Sands, Sean
    Ferraro, Carla
    Mavrommatis, Alexis
    Lu, Steven
    EUROPEAN JOURNAL OF MARKETING, 2019, 54 (03) : 511 - 524
  • [4] Scientific VS Non-Scientific Citation Annotational Complexity Analysis using Machine Learning Classifiers
    Raza, Hassan
    Faizan, M.
    Akhtar, Naeem
    Abbas, Ayesha
    Naveed-Ul-Hassan
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (02) : 210 - 213
  • [5] APPLYING METRICS TO MACHINE-LEARNING TOOLS - A KNOWLEDGE ENGINEERING APPROACH
    ALONSO, F
    MATE, L
    JURISTO, N
    MUNOZ, PL
    PAZOS, J
    AI MAGAZINE, 1994, 15 (03) : 63 - 75
  • [6] A Machine-Learning Based Approach for Measuring the Completeness of Online Privacy Policies
    Guntamukkala, Niharika
    Dara, Rozita
    Grewal, Gary
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 289 - 294
  • [7] A Collaborative Approach Toward Scientific Paper Recommendation Using Citation Context
    Sakib, Nazmus
    Ahmad, Rodina Binti
    Haruna, Khalid
    IEEE ACCESS, 2020, 8 : 51246 - 51255
  • [8] Machine-Learning Approach to Analysis of Driving Simulation Data
    Yoshizawa, Akira
    Nishiyama, Hiroyuki
    Iwasaki, Hirotoshi
    Mizoguchi, Fumio
    2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 398 - 402
  • [9] A novel machine-learning based approach to predict flares of psoriasis
    Ramelyte, E.
    Djamei, V.
    Maul, T. J.
    Anzengruber, F.
    Navarini, A.
    EXPERIMENTAL DERMATOLOGY, 2018, 27 (03) : E44 - E45
  • [10] The Translational Machine: A novel machine-learning approach to illuminate complex genetic architectures
    Askland, Kathleen D.
    Strong, David
    Wright, Marvin N.
    Moore, Jason H.
    GENETIC EPIDEMIOLOGY, 2021, 45 (05) : 485 - 536