MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks

Cited by: 41
Authors
Sun, Xiaobing [1 ,2 ]
Li, Bixin [3 ]
Leung, Hareton [4 ]
Li, Bin [1 ]
Li, Yun [1 ]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210008, Jiangsu, Peoples R China
[3] Southeast Univ, Sch Engn & Comp Sci, Nanjing, Jiangsu, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
Software maintenance; Mining software historical repositories; Topic model; Empirical study; CHANGE IMPACT ANALYSIS; INFORMATION-RETRIEVAL; FEATURE LOCATION; SOURCE CODE; TAXONOMY
DOI
10.1016/j.infsof.2015.05.003
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Mining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repository, communication archives, source control repository, etc. When using these repositories to support software maintenance, inclusion of irrelevant information in each repository can lead to decreased effectiveness or even wrong results.
Objective: This article aims at selecting the relevant information from each of the repositories to improve effectiveness of software maintenance tasks.
Method: For a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses the topic model to extract the topics from these software repositories. Then, relevant information in each software repository is extracted based on the topics.
Results: MSR4SM is evaluated for two software maintenance tasks, feature location and change impact analysis, which are based on four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software repositories based maintenance tasks can be greatly improved by MSR4SM.
Conclusions: There is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved.
(C) 2015 Elsevier B.V. All rights reserved.
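The selection rule in the Method paragraph (keep a repository item if it is relevant to either the maintenance request or the current system) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how relevance is scored, so cosine similarity over topic distributions, the `threshold` parameter, the function names, and all sample data are hypothetical. The topic distributions are assumed to come from a topic model fitted beforehand.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two topic-weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def select_relevant(repo_topics, request_topics, system_topics, threshold=0.5):
    """Keep items whose topics match EITHER the request OR the current system.

    The similarity measure and threshold are assumptions made for this
    sketch; the abstract does not specify them.
    """
    kept = []
    for item_id, topics in repo_topics.items():
        score = max(cosine(topics, request_topics),
                    cosine(topics, system_topics))
        if score >= threshold:
            kept.append(item_id)
    return kept


# Hypothetical topic distributions over three topics, as a topic model
# might produce for items in a bug repository, mail archive, and VCS.
repo_topics = {
    "bug-101": [0.8, 0.1, 0.1],
    "mail-7": [0.1, 0.1, 0.8],
    "commit-42": [0.1, 0.8, 0.1],
}
request_topics = [0.9, 0.05, 0.05]   # maintenance request
system_topics = [0.1, 0.85, 0.05]    # current system

relevant = select_relevant(repo_topics, request_topics, system_topics,
                           threshold=0.9)
print(relevant)
```

Taking the maximum of the two similarities encodes the "either ... or" condition from the abstract; a stricter variant could require both to exceed the threshold.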
Pages: 1-12
Number of pages: 12