Choosing experiments to accelerate collective discovery

Cited by: 148
Authors
Rzhetsky, Andrey [1 ,2 ,3 ,4 ,5 ]
Foster, Jacob G. [6 ]
Foster, Ian T. [3 ,4 ,7 ]
Evans, James A. [3 ,4 ,8 ]
Affiliations
[1] Univ Chicago, Dept Med, Chicago, IL 60637 USA
[2] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[3] Univ Chicago, Computat Inst, Chicago, IL 60637 USA
[4] Argonne Natl Lab, Chicago, IL 60637 USA
[5] Univ Chicago, Inst Genom & Syst Biol, Chicago, IL 60637 USA
[6] Univ Calif Los Angeles, Dept Sociol, Los Angeles, CA 90095 USA
[7] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60637 USA
[8] Univ Chicago, Dept Sociol, Chicago, IL 60637 USA
Funding
US National Science Foundation; US National Institutes of Health;
Keywords
complex networks; computational biology; science of science; innovation; sociology of science; PROBLEM CHOICE; SMALL-WORLD; KNOWLEDGE; NETWORK; COLLABORATION; MECHANISMS; INNOVATION; SCIENCES; DYNAMICS; DIVISION;
DOI
10.1073/pnas.1509757112
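Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes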
07 ; 0710 ; 09 ;
Abstract
A scientist's choice of research problem affects his or her personal career trajectory. Scientists' combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity's importance corresponds to its degree centrality, and a problem's difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.
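The mapping described in the abstract (entity importance as degree centrality, problem difficulty as network distance) can be illustrated with a minimal sketch. The toy graph, the entity names, and the helper functions `degree`, `distance`, and `describe_problem` are all hypothetical and are not taken from the paper; this is not the authors' generative model, only the two network properties it is said to build on.

```python
from collections import deque

# Hypothetical toy knowledge network: nodes are entities (e.g. molecules),
# edges are already-published relationships. Not data from the paper.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "E"},
    "D": {"A"},
    "E": {"C"},
    "F": set(),  # an isolated entity with no published links
}

def degree(graph, node):
    """Degree centrality (raw count of neighbours): a proxy for importance."""
    return len(graph[node])

def distance(graph, source, target):
    """Shortest-path length via breadth-first search: a proxy for difficulty.

    Returns None when the two entities are not connected at all.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def describe_problem(graph, u, v):
    """Summarise a candidate research problem, i.e. a not-yet-published link u-v."""
    return {
        "importance": (degree(graph, u), degree(graph, v)),
        "difficulty": distance(graph, u, v),
    }

if __name__ == "__main__":
    # A link between two peripheral entities that are three steps apart.
    print(describe_problem(graph, "B", "E"))
    # A link touching the hub "A", spanning a shorter distance.
    print(describe_problem(graph, "A", "E"))
```

In this toy setting, a "conservative" choice in the abstract's sense would be a candidate link that involves a high-degree entity and spans a short distance, such as A-E, while B-E is the riskier option.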
Pages: 14569 / 14574
Number of pages: 6
Related papers
65 in total
[1]   powerlaw: A Python Package for Analysis of Heavy-Tailed Distributions [J].
Alstott, Jeff ;
Bullmore, Edward T. ;
Plenz, Dietmar .
PLOS ONE, 2014, 9 (01)
[2]  
[Anonymous], 2006, KDD
[3]  
[Anonymous], 1878, POPULAR SCI MONTHLY
[4]  
[Anonymous], 1986, Laboratory life: the construction of scientific facts
[5]  
[Anonymous], 2010, Networks, crowds, and markets
[6]  
[Anonymous], 1995, CONTINUOUS UNIVARIAT
[7]  
[Anonymous], 2002, Journal of Social Structure
[8]   Matthew: Effect or Fable? [J].
Azoulay, Pierre ;
Stuart, Toby ;
Wang, Yanbo .
MANAGEMENT SCIENCE, 2014, 60 (01) :92-109
[9]   Evolution of the social network of scientific collaborations [J].
Barabási, AL ;
Jeong, H ;
Néda, Z ;
Ravasz, E ;
Schubert, A ;
Vicsek, T .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2002, 311 (3-4) :590-614
[10]   Emergence of scaling in random networks [J].
Barabási, AL ;
Albert, R .
SCIENCE, 1999, 286 (5439) :509-512