Studying the Change Histories of Stack Overflow and GitHub Snippets

Cited by: 8
Authors
Manes, Saraj Singh [1]
Baysal, Olga [1]
Affiliations
[1] Carleton Univ, Sch Comp Sci, Ottawa, ON, Canada
Keywords
Code snippets; change history; evolution; Stack Overflow; GitHub; time series; co-change; code reuse; CODE; HARMFUL
DOI
10.1109/MSR52588.2021.00040
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Stack Overflow is a popular Q&A forum for software developers, providing a large number of copyable code snippets. While GitHub is a collaborative development platform, developers often reuse Stack Overflow code in their GitHub projects. These snippets get revised or edited on each platform. In this work, we study Stack Overflow posts and the code snippets that are reused from these posts in GitHub projects. We investigate and compare the change history of SO snippets with the change history of GitHub snippets. We have applied a stratified random sampling when mining 440,000 GitHub projects to create a dataset representing the change history of the reused snippets; this dataset contains 22,900 GitHub projects, 33,765 Stack Overflow references mapped to 4,634 Stack Overflow posts, and a total of 73,322 commits. We analyze the evolution patterns of snippets on each platform, compare key trends and explore the co-change of these snippets. Our results demonstrate that 76% of snippets evolve on Stack Overflow, while only 22% of the reused code snippets evolve in GitHub. Stack Overflow snippets undergo fewer and smaller changes compared to their evolving counterparts on GitHub. The evolution of snippets on both platforms is driven by the original author of the content. Finally, we found that a small percentage of snippets is co-changing across two platforms, while snippets in GitHub and Stack Overflow evolve independently of one another.
Pages: 283-294
Page count: 12