ChatGPT as Research Scientist: Probing GPT's capabilities as a Research Librarian, Research Ethicist, Data Generator, and Data Predictor

Cited by: 3
Authors
Lehr, Steven A. [1]
Caliskan, Aylin [2]
Liyanage, Suneragiri [3]
Banaji, Mahzarin R. [3]
Affiliations
[1] Cangrade Inc, Watertown, MA 02472 USA
[2] Univ Washington, Informat Sch, Seattle, WA 98195 USA
[3] Harvard Univ, Dept Psychol, Cambridge, MA 02138 USA
Keywords
generative AI; large language models; scientific methods; cognitive science
DOI
10.1073/pnas.2404328121
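Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09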
Abstract
How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered in large language corpora, indicating that ChatGPT can simulate known results, an antecedent to usefulness for both data generation and skills like hypothesis generation. Contrastingly, in Study 4 (Novel Data Predictor), neither model was successful at predicting new results absent in their training data, and neither appeared to leverage substantially new information when predicting more vs. less novel outcomes. Together, these results suggest that GPT is a flawed but rapidly improving librarian, a decent research ethicist already, and capable of data generation in simple domains with known characteristics, but poor at predicting novel patterns of empirical data to aid future experimentation.
Pages: 9