Studying the Change Histories of Stack Overflow and GitHub Snippets

Cited by: 8
Authors
Manes, Saraj Singh [1]
Baysal, Olga [1]
Affiliations
[1] Carleton Univ, Sch Comp Sci, Ottawa, ON, Canada
Keywords
Code snippets; change history; evolution; Stack Overflow; GitHub; time series; co-change; code reuse; CODE; HARMFUL;
DOI
10.1109/MSR52588.2021.00040
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Stack Overflow (SO) is a popular Q&A forum for software developers, providing a large number of copyable code snippets. GitHub is a collaborative development platform, and developers often reuse Stack Overflow code in their GitHub projects. These snippets are then revised or edited on each platform. In this work, we study Stack Overflow posts and the code snippets that are reused from these posts in GitHub projects. We investigate the change history of SO snippets and compare it with the change history of GitHub snippets. We applied stratified random sampling when mining 440,000 GitHub projects to create a dataset representing the change history of the reused snippets; this dataset contains 22,900 GitHub projects, 33,765 Stack Overflow references mapped to 4,634 Stack Overflow posts, and a total of 73,322 commits. We analyze the evolution patterns of snippets on each platform, compare key trends, and explore the co-change of these snippets. Our results demonstrate that 76% of snippets evolve on Stack Overflow, while only 22% of the reused code snippets evolve on GitHub. Stack Overflow snippets undergo fewer and smaller changes than their evolving counterparts on GitHub. The evolution of snippets on both platforms is driven by the original author of the content. Finally, we found that only a small percentage of snippets co-change across the two platforms, while snippets on GitHub and Stack Overflow otherwise evolve independently of one another.
Pages: 283-294
Page count: 12
Related papers
  • [31] An empirical study of code reuse between GitHub and stack overflow during software development
    Chen, Xiangping
    Xu, Furen
    Huang, Yuan
    Zhou, Xiaocong
    Zheng, Zibin
    JOURNAL OF SYSTEMS AND SOFTWARE, 2024, 210
  • [32] Exploring the problems, their causes and solutions of AI pair programming: A study on GitHub and Stack Overflow
    Zhou, Xiyu
    Liang, Peng
    Zhang, Beiqi
    Li, Zengyang
    Ahmad, Aakash
    Shahin, Mojtaba
    Waseem, Muhammad
    JOURNAL OF SYSTEMS AND SOFTWARE, 2025, 219
  • [33] The State of Practice on Virtual Reality (VR) Applications: an Exploratory Study on Github and Stack Overflow
    Ghrairi, Naoures
    Kpodjedo, Segla
    Barrak, Amine
    Petrillo, Fabio
    Khomh, Foutse
    2018 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2018), 2018, : 356 - 366
  • [34] Software development during COVID-19 pandemic: An analysis of stack overflow and GitHub
    de Oliveira, Pedro Almir Martins
    dos Santos Neto, Pedro de Alcântara
    Silva, Gleison
    Ibiapina, Irvayne
    Lira, Werney L.
    de Castro Andrade, Rossana Maria
    arXiv, 2021,
  • [35] DICOS: Discovering Insecure Code Snippets from Stack Overflow Posts by Leveraging User Discussions
    Hong, Hyunji
    Woo, Seunghoon
    Lee, Heejo
    37TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE, ACSAC 2021, 2021, : 194 - 206
  • [36] Code2Que: A Tool for Improving Question Titles from Mined Code Snippets in Stack Overflow
    Gao, Zhipeng
    Xia, Xin
    Lo, David
    Grundy, John
    Li, Yuan-Fang
    PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), 2021, : 1525 - 1529
  • [37] Code2Que: A tool for improving question titles from mined code snippets in stack overflow
    Gao, Zhipeng
    Xia, Xin
    Lo, David
    Grundy, John
    Li, Yuan-Fang
    ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021, : 1525 - 1529
  • [38] iDev: Enhancing Social Coding Security by Cross-platform User Identification Between GitHub and Stack Overflow
    Fan, Yujie
    Zhang, Yiming
    Hou, Shifu
    Chen, Lingwei
    Ye, Yanfang
    Shi, Chuan
    Zhao, Liang
    Xu, Shouhuai
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2272 - 2278
  • [39] Studying Developer Reading Behavior on Stack Overflow during API Summarization Tasks
    Saddler, Jonathan A.
    Peterson, Cole S.
    Sama, Sanjana
    Nagaraj, Shruthi
    Baysal, Olga
    Guerrouj, Latifa
    Sharif, Bonita
    PROCEEDINGS OF THE 2020 IEEE 27TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION, AND REENGINEERING (SANER '20), 2020, : 195 - 205