ChatGPT and Bard Performance on the POSCOMP Exam

Cited by: 0
|
Authors
Saldanha, Mateus Santos [1 ]
Digiampietri, Luciano Antonio [1 ]
Affiliations
[1] Univ Sao Paulo, Sao Paulo, SP, Brazil
Keywords
Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard;
DOI
10.1145/3658271.3658320
CLC Number (Chinese Library Classification)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is Information Processing Theory. Method: Our materials comprised 271 questions from the last five POSCOMP exams that did not rely on graphical content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without the additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots or those relying on technologies built upon these capabilities.
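To illustrate the two-format evaluation protocol described in the abstract, the following is a minimal Python sketch. The `ask_chatbot` function and `Question` structure are hypothetical placeholders (the record does not specify an implementation or API); the two prompt formats mirror the "direct" and "with added context" conditions, and accuracy is simply the fraction of correctly chosen alternatives.

```python
# Minimal sketch of the two-format evaluation protocol.
# ask_chatbot() and Question are hypothetical placeholders, not the authors' code.

from dataclasses import dataclass


@dataclass
class Question:
    statement: str       # full question text, including the alternatives
    correct_letter: str  # expected answer, e.g. "C"


CONTEXT_PREFIX = (
    "You are answering a multiple-choice question from a computing exam. "
    "Reply with the letter of the correct alternative.\n\n"
)


def ask_chatbot(prompt: str) -> str:
    """Placeholder for a call to ChatGPT or Bard; returns the chosen letter."""
    raise NotImplementedError("wire this to the chatbot API being evaluated")


def accuracy(questions: list[Question], add_context: bool) -> float:
    """Fraction of questions answered correctly under one prompt format."""
    correct = 0
    for q in questions:
        prompt = (CONTEXT_PREFIX + q.statement) if add_context else q.statement
        answer = ask_chatbot(prompt).strip().upper()
        correct += answer.startswith(q.correct_letter)
    return correct / len(questions)


# Example comparison over the 271 text-only POSCOMP questions:
# acc_direct  = accuracy(questions, add_context=False)
# acc_context = accuracy(questions, add_context=True)
```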
Pages: 10
Related Papers (50 items in total)
  • [31] Performance of "Bard", Google's Artificial Intelligence Chatbot, on Ophthalmology Board Exam Practice Questions
    Botross, Monica
    Mohammadi, Seyed Omid
    Montgomery, Kendall
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)
  • [32] Performance of ChatGPT-3.5, ChatGPT-4, Microsoft Copilot, and Google Bard To Identify Correct Information for Lung Cancer
    Le, Hoa
    Truong, Chi
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2024, 33 : 347 - 348
  • [33] ChatGPT: The End of Online Exam Integrity?
    Susnjak, Teo
    McIntosh, Timothy R.
    EDUCATION SCIENCES, 2024, 14 (06)
  • [34] Performance of ChatGPT and Bard on the medical licensing examinations varies across different cultures: a comparison study
    Chen, Yikai
    Huang, Xiujie
    Yang, Fangjie
    Lin, Haiming
    Lin, Haoyu
    Zheng, Zhuoqun
    Liang, Qifeng
    Zhang, Jinhai
    Li, Xinxin
    BMC MEDICAL EDUCATION, 2024, 24 (01)
  • [35] AI-Powered Renal Diet Support: Performance of ChatGPT, Bard AI, and Bing Chat
    Qarajeh, Ahmad
    Tangpanithandee, Supawit
    Thongprayoon, Charat
    Suppadungsuk, Supawadee
    Krisanapan, Pajaree
    Aiumtrakul, Noppawit
    Valencia, Oscar A. Garcia
    Miao, Jing
    Qureshi, Fawad
    Cheungpasitporn, Wisit
    CLINICS AND PRACTICE, 2023, 13 (05) : 1160 - 1172
  • [36] Performance assessment of ChatGPT 4, ChatGPT 3.5, Gemini Advanced Pro 1.5 and Bard 2.0 to problem solving in pathology in French language
    Tarris, Georges
    Martin, Laurent
    DIGITAL HEALTH, 2025, 11
  • [37] News Verifiers Showdown: A Comparative Performance Evaluation of ChatGPT 3.5, ChatGPT 4.0, Bing AI, and Bard in News Fact-Checking
    Caramancion, Kevin Matthe
    2023 IEEE FUTURE NETWORKS WORLD FORUM, FNWF, 2024
  • [38] ChatGPT or Bard: Who is a better Certified Ethical Hacker?
    Raman, Raghu
    Calyam, Prasad
    Achuthan, Krishnashree
    COMPUTERS & SECURITY, 2024, 140
  • [39] Assessing ChatGPT's orthopedic in-service training exam performance and applicability in the field
    Jain, Neil
    Gottlich, Caleb
    Fisher, John
    Campano, Dominic
    Winston, Travis
    JOURNAL OF ORTHOPAEDIC SURGERY AND RESEARCH, 2024, 19 (01)
  • [40] Is ChatGPT ready for primetime? Performance of artificial intelligence on a simulated Canadian urology board exam
    Touma, Naji J.
    Caterini, Jessica
    Liblik, Kiera
    CUAJ-CANADIAN UROLOGICAL ASSOCIATION JOURNAL, 2024, 18 (10): 329 - 332