ChatGPT and Bard Performance on the POSCOMP Exam

Authors
Saldanha, Mateus Santos [1]
Digiampietri, Luciano Antonio [1]
Affiliations
[1] Univ Sao Paulo, Sao Paulo, SP, Brazil
Keywords
Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard
DOI
10.1145/3658271.3658320
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is information processing theory. Method: Our materials were the 271 questions from the last five POSCOMP exams that did not rely on graphic content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is an exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots, or relying on technologies built upon these capabilities.
Pages: 10