Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions

Cited by: 8

Authors:

Kabir, Samia [1]

Udo-Imeh, David N. [1]

Kou, Bonan [1]

Zhang, Tianyi [1]

Affiliations:

[1] Purdue Univ, W Lafayette, IN 47907 USA

Source:

PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024) | 2024

Keywords:

stack overflow; q&a; large language model; chatgpt; misinformation;

DOI:

10.1145/3613904.3642596

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Q&A platforms have been crucial for the online help-seeking behavior of programmers. However, the recent popularity of ChatGPT is altering this trend. Despite this popularity, no comprehensive study has been conducted to evaluate the characteristics of ChatGPT's answers to programming questions. To bridge the gap, we conducted the first in-depth analysis of ChatGPT answers to 517 programming questions on Stack Overflow and examined the correctness, consistency, comprehensiveness, and conciseness of ChatGPT answers. Furthermore, we conducted a large-scale linguistic analysis, as well as a user study, to understand the characteristics of ChatGPT answers from linguistic and human aspects. Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose. Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style. However, they also overlooked the misinformation in the ChatGPT answers 39% of the time. This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.

Pages: 17
