MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks

Cited by: 41
Authors
Sun, Xiaobing [1, 2]
Li, Bixin [3]
Leung, Hareton [4]
Li, Bin [1]
Li, Yun [1]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210008, Jiangsu, Peoples R China
[3] Southeast Univ, Sch Engn & Comp Sci, Nanjing, Jiangsu, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
Software maintenance; Mining software historical repositories; Topic model; Empirical study; CHANGE IMPACT ANALYSIS; INFORMATION-RETRIEVAL; FEATURE LOCATION; SOURCE CODE; TAXONOMY
DOI
10.1016/j.infsof.2015.05.003
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Context: Mining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repository, communication archives, source control repository, etc. When using these repositories to support software maintenance, inclusion of irrelevant information in each repository can lead to decreased effectiveness or even wrong results.
Objective: This article aims at selecting the relevant information from each of the repositories to improve effectiveness of software maintenance tasks.
Method: For a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses the topic model to extract the topics from these software repositories. Then, relevant information in each software repository is extracted based on the topics.
Results: MSR4SM is evaluated for two software maintenance tasks, feature location and change impact analysis, which are based on four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software repositories based maintenance tasks can be greatly improved by MSR4SM.
Conclusions: There is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved.
© 2015 Elsevier B.V. All rights reserved.
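Illustrative sketch (not from the paper): the selection rule in the Method part, keeping a repository item when it is relevant to either the maintenance request or the current system, can be approximated as below. This is not the authors' MSR4SM implementation. It swaps the topic model for plain cosine similarity over term-frequency vectors so the example stays self-contained, and the names `term_vector`, `cosine`, `select_relevant`, the `threshold` value, and the sample bug reports are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text, keep alphabetic tokens, and count term frequencies."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_relevant(repository_items, request, system_text, threshold=0.2):
    """Keep items similar to EITHER the maintenance request OR the current system.

    Assumption: term-frequency similarity stands in for the topic-based
    comparison described in the abstract; the threshold is arbitrary.
    """
    req_vec = term_vector(request)
    sys_vec = term_vector(system_text)
    kept = []
    for item in repository_items:
        vec = term_vector(item)
        if max(cosine(vec, req_vec), cosine(vec, sys_vec)) >= threshold:
            kept.append(item)
    return kept


# Hypothetical repository content, request, and system text.
bug_reports = [
    "Crash when saving buffer to disk",
    "Update the project website logo",
    "Syntax highlighting breaks after undo in buffer",
]
request = "Fix crash on saving a buffer"
system_text = "class Buffer: def save(self): ... def undo(self): ..."

relevant = select_relevant(bug_reports, request, system_text)
```

In this toy input the website report shares no terms with the request or the system text, so it is filtered out, while the other two reports are kept. A closer approximation would compare per-document topic distributions from a trained topic model instead of raw term counts.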
Pages: 1-12
Page count: 12