SciSciNet: A large-scale open data lake for the science of science research

Cited by: 40
Authors
Lin, Zihang [1 ,2 ,3 ,4 ]
Yin, Yian [1 ,2 ,3 ,5 ]
Liu, Lu [1 ,2 ,3 ]
Wang, Dashun [1 ,2 ,3 ,5 ]
Affiliations
[1] Northwestern Univ, Ctr Sci Sci & Innovat, Evanston, IL 60201 USA
[2] Northwestern Univ, Northwestern Inst Complex Syst, Evanston, IL 60201 USA
[3] Northwestern Univ, Kellogg Sch Management, Evanston, IL 60201 USA
[4] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[5] Northwestern Univ, McCormick Sch Engn, Evanston, IL 60201 USA
Funding
US National Science Foundation
Keywords
GENDER-DIFFERENCES; KNOWLEDGE TRANSFER; SOCIAL-SCIENCE; IMPACT; DISTRIBUTIONS; PUBLICATIONS; PRODUCTIVITY; TECHNOLOGY; CITATIONS; LINKAGE;
DOI
10.1038/s41597-023-02198-9
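Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes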
07 ; 0710 ; 09 ;
Abstract
The science of science has attracted growing research interest, partly due to the increasing availability of large-scale datasets capturing the inner workings of science. These datasets, and the numerous linkages among them, enable researchers to ask a range of fascinating questions about how science works and where innovation occurs. Yet as datasets grow, it becomes increasingly difficult to track available sources and linkages across datasets. Here we present SciSciNet, a large-scale open data lake for the science of science research, covering over 134M scientific publications and millions of external linkages to funding and public uses. We offer detailed documentation of pre-processing steps and analytical choices in constructing the data lake. We further supplement the data lake by computing frequently used measures in the literature, illustrating how researchers may contribute collectively to enriching the data lake. Overall, this data lake serves as an initial but useful resource for the field, by lowering the barrier to entry, reducing duplication of efforts in data processing and measurements, improving the robustness and replicability of empirical claims, and broadening the diversity and representation of ideas in the field.
Pages: 22