MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks

Cited by: 41
Authors
Sun, Xiaobing [1, 2]
Li, Bixin [3]
Leung, Hareton [4]
Li, Bin [1]
Li, Yun [1]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210008, Jiangsu, Peoples R China
[3] Southeast Univ, Sch Engn & Comp Sci, Nanjing, Jiangsu, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
Software maintenance; Mining software historical repositories; Topic model; Empirical study; CHANGE IMPACT ANALYSIS; INFORMATION-RETRIEVAL; FEATURE LOCATION; SOURCE CODE; TAXONOMY
DOI
10.1016/j.infsof.2015.05.003
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Mining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repository, communication archives, source control repository, etc. When using these repositories to support software maintenance, inclusion of irrelevant information in each repository can lead to decreased effectiveness or even wrong results.
Objective: This article aims at selecting the relevant information from each of the repositories to improve effectiveness of software maintenance tasks.
Method: For a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses the topic model to extract the topics from these software repositories. Then, relevant information in each software repository is extracted based on the topics.
Results: MSR4SM is evaluated for two software maintenance tasks, feature location and change impact analysis, which are based on four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software repositories based maintenance tasks can be greatly improved by MSR4SM.
Conclusions: There is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved.
(C) 2015 Elsevier B.V. All rights reserved.
Pages: 1 - 12
Page count: 12
Related papers
50 in total
  • [31] Mining software repositories for adaptive change commits using machine learning techniques
    Megdadi, Omar
    Alhindawi, Nouh
    Alsakran, Jamal
    Saifan, Ahmad
    Migdadi, Hatim
    INFORMATION AND SOFTWARE TECHNOLOGY, 2019, 109 : 80 - 91
  • [32] VA4SM: A Visual Analytics Tool for Software Maintenance
    Reddivari, Sandeep
    Liu, Kaihua
    Reddy, Reyansh
    2023 IEEE 47TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC, 2023, : 1002 - 1003
  • [33] Improving Software Maintenance Using Process Mining and Predictive Analytics
    Gupta, Monika
    Serebrenik, Alexander
    Jalote, Pankaj
    2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), 2017, : 681 - 686
  • [34] Using Alloy to Support Feature-Based DSL Construction for Mining Software Repositories
    Huang, Changyun
    Kamei, Yasutaka
    Yamashita, Kazuhiro
    Ubayashi, Naoyasu
    PROCEEDINGS OF THE 17TH INTERNATIONAL SOFTWARE PRODUCT LINE CONFERENCE CO-LOCATED WORKSHOPS (SPLC'13 WORKSHOPS), 2013, : 86 - 89
  • [35] Predicting Software Maintenance Effort by Mining Software Project Reports Using Inter-Version Validation
    Jindal, Rajni
    Malhotra, Ruchika
    Jain, Abha
    INTERNATIONAL JOURNAL OF RELIABILITY QUALITY AND SAFETY ENGINEERING, 2016, 23 (06)
  • [36] Cost Adjustment for Software Crowdsourcing Tasks Using Ensemble Effort Estimation and Topic Modeling
    Yasmin, Anum
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024, 49 (09) : 12693 - 12728
  • [37] Creating and Analyzing Source Code Repository Models A Model-based Approach to Mining Software Repositories
    Scheidgen, Markus
    Smidt, Martin
    Fischer, Joachim
    MODELSWARD: PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON MODEL-DRIVEN ENGINEERING AND SOFTWARE DEVELOPMENT, 2017, : 329 - 336
  • [38] Extracting software static defect models using data mining
    Yousef, Ahmed H.
    AIN SHAMS ENGINEERING JOURNAL, 2015, 6 (01) : 133 - 144
  • [39] Improving Recall of software defect prediction models using association mining
    Rana, Zeeshan Ali
    Mian, M. Awais
    Shamail, Shafay
    KNOWLEDGE-BASED SYSTEMS, 2015, 90 : 1 - 13
  • [40] An Exploratory Evaluation of Large Language Models Using Empirical Software Engineering Tasks
    Liang, Wenjun
    Xiao, Guanping
    PROCEEDINGS OF THE 15TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE, INTERNETWARE 2024, 2024, : 31 - 40