The Rapid Development of Artificial Intelligence: GPT-4's Performance on Orthopedic Surgery Board Questions

Cited by: 17
Authors
Hofmann, Hayden L. [1, 3]
Guerra, Gage A. [1]
Le, Jonathan L. [1]
Wong, Alexander M. [1]
Hofmann, Grady H. [2]
Mayfield, Cory K. [1]
Petrigliano, Frank A. [1]
Liu, Joseph N. [1]
Affiliations
[1] Keck Med USC, USC Epstein Family Ctr Sports Med, Los Angeles, CA USA
[2] Stanford Univ, Dept Biol, Palo Alto, CA USA
[3] Keck Med USC, USC Epstein Family Ctr Sports Med, 1520 San Pablo St 2000, Los Angeles, CA 90033 USA
DOI
10.3928/01477447-20230922-05
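Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]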
Abstract
Advances in artificial intelligence and machine learning models, such as Chat Generative Pre-trained Transformer (ChatGPT), have occurred at a remarkably fast rate. OpenAI released GPT-4, its newest ChatGPT model, in March 2023. The model offers a wide range of medical applications and has demonstrated notable proficiency on many medical board examinations. This study sought to assess GPT-4's performance on the Orthopaedic In-Training Examination (OITE), which is used to prepare residents for the American Board of Orthopaedic Surgery (ABOS) Part I Examination. GPT-4's performance was also compared with that of the previous iteration of ChatGPT, GPT-3.5, which was released 4 months before GPT-4. GPT-4 correctly answered 251 of the 396 attempted questions (63.4%), whereas GPT-3.5 correctly answered 46.3% of 410 attempted questions. GPT-4 was significantly more accurate than GPT-3.5 on orthopedic board-style questions (P<.00001). GPT-4's performance was most comparable to that of an average third-year orthopedic surgery resident, whereas GPT-3.5 performed below the level of an average orthopedic intern. GPT-4's overall accuracy was just below the approximate threshold that indicates a likely pass on the ABOS Part I Examination. Our results demonstrate significant improvements in OpenAI's newest model, GPT-4. Future studies should assess potential clinical applications as AI models continue to be trained on larger data sets and offer more capabilities. [Orthopedics. 2024;47(2):e85-e89.]
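The accuracy comparison reported in the abstract can be checked with a short calculation. The sketch below is illustrative only: the abstract does not name the statistical test the authors used, so a pooled two-proportion z-test is assumed here, and the GPT-3.5 correct-answer count (190) is an assumption back-calculated from the reported 46.3% of 410 questions. The function and variable names are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts stated in the abstract.
gpt4_correct, gpt4_total = 251, 396
gpt35_total = 410
# Assumption: the abstract gives only a percentage for GPT-3.5, so the
# count is reconstructed by rounding 46.3% of 410 to the nearest integer.
gpt35_correct = round(0.463 * gpt35_total)

z, p_value = two_proportion_z_test(
    gpt4_correct, gpt4_total, gpt35_correct, gpt35_total
)

print(f"GPT-4 accuracy:   {gpt4_correct / gpt4_total:.1%}")
print(f"GPT-3.5 accuracy: {gpt35_correct / gpt35_total:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
```

Under these assumptions the p-value falls below the reported threshold of .00001, which is consistent with the abstract. A chi-square test on the same 2x2 table would be an equally plausible choice.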
Pages: e85-e89
Page count: 6