ChatGPT and Bard Performance on the POSCOMP Exam

Cited by: 0
Authors
Saldanha, Mateus Santos [1 ]
Digiampietri, Luciano Antonio [1 ]
Affiliations
[1] Univ Sao Paulo, Sao Paulo, SP, Brazil
Keywords
Large Language Model; ChatBot; Computer Science Examination; ChatGPT; Bard;
DOI
10.1145/3658271.3658320
CLC classification
TP [Automation and Computer Technology];
Discipline code
0812;
Abstract
Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: The IS theory used in this work is Information Processing Theory. Method: Our materials comprised 271 questions from the last five POSCOMP exams that did not rely on graphical content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, without the additional context added to the prompt. They exhibited similar performance levels, with a slight advantage observed for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field involves the exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for individuals developing Information Systems with the assistance of such chatbots or those relying on technologies built upon these capabilities.
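The two prompting conditions described in the Method can be sketched in a few lines of code. The snippet below is a minimal illustration, not the authors' evaluation harness: the helper ask_chatbot, the preamble wording, and the question-dictionary fields are assumptions introduced here for illustration only.

```python
# Minimal sketch of the two prompting conditions (direct vs. with added context).
# ask_chatbot is a hypothetical callable wrapping whichever chat API is under
# test (e.g. ChatGPT or Bard); it is not taken from the paper.

from typing import Callable

CONTEXT_PREAMBLE = (
    "You are answering a multiple-choice question from a computing exam. "
    "Reply with the letter of the correct alternative.\n\n"
)

def build_prompts(question_text: str) -> dict[str, str]:
    """Return both prompt variants: the raw exam question and the same
    question preceded by the extra context."""
    return {
        "direct": question_text,
        "with_context": CONTEXT_PREAMBLE + question_text,
    }

def score(questions: list[dict], ask_chatbot: Callable[[str], str]) -> dict[str, float]:
    """Fraction of correct answers per prompt variant.

    Each question dict is assumed to hold 'text' (statement plus
    alternatives) and 'answer' (the correct letter, e.g. 'C')."""
    correct = {"direct": 0, "with_context": 0}
    for q in questions:
        for variant, prompt in build_prompts(q["text"]).items():
            reply = ask_chatbot(prompt).strip().upper()
            if reply.startswith(q["answer"].upper()):
                correct[variant] += 1
    total = len(questions)
    return {variant: hits / total for variant, hits in correct.items()}
```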
Pages: 10