MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks

Cited by: 41
Authors
Sun, Xiaobing [1, 2]
Li, Bixin [3]
Leung, Hareton [4]
Li, Bin [1]
Li, Yun [1]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210008, Jiangsu, Peoples R China
[3] Southeast Univ, Sch Engn & Comp Sci, Nanjing, Jiangsu, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Keywords
Software maintenance; Mining software historical repositories; Topic model; Empirical study; CHANGE IMPACT ANALYSIS; INFORMATION-RETRIEVAL; FEATURE LOCATION; SOURCE CODE; TAXONOMY;
DOI
10.1016/j.infsof.2015.05.003
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Context: Mining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repository, communication archives, source control repository, etc. When using these repositories to support software maintenance, inclusion of irrelevant information in each repository can lead to decreased effectiveness or even wrong results.
Objective: This article aims at selecting the relevant information from each of the repositories to improve effectiveness of software maintenance tasks.
Method: For a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses the topic model to extract the topics from these software repositories. Then, relevant information in each software repository is extracted based on the topics.
Results: MSR4SM is evaluated for two software maintenance tasks, feature location and change impact analysis, which are based on four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software repositories based maintenance tasks can be greatly improved by MSR4SM.
Conclusions: There is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved.
(C) 2015 Elsevier B.V. All rights reserved.
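The filtering idea described in the Method part of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which topic model, similarity measure, or threshold MSR4SM uses, so the collapsed Gibbs sampler for LDA, the cosine similarity, and the names `lda_gibbs`, `cosine`, and `select_relevant` are all assumptions made for illustration. The sketch fits topics over a maintenance request together with the repository items, then keeps the items whose topic distribution is close to that of the request.

```python
import math
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the paper's code).

    docs is a list of token lists; returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                        # n(k)
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def select_relevant(request, items, num_topics=2, threshold=0.5):
    """Keep repository items whose topics resemble the request's topics.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    docs = [request.lower().split()] + [item.lower().split() for item in items]
    theta = lda_gibbs(docs, num_topics)
    return [item for item, t in zip(items, theta[1:]) if cosine(theta[0], t) >= threshold]


if __name__ == "__main__":
    request = "fix crash when saving file in editor buffer"
    repository = [
        "editor buffer save crash on file write",
        "update website logo and colors",
        "file save dialog crash in editor",
    ]
    for item in select_relevant(request, repository):
        print(item)
```

With so few short documents the sampled topics are noisy, so the selection printed by the usage example should be read as a demonstration of the data flow rather than as a meaningful relevance judgment.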
Pages: 1 - 12
Page count: 12
Related papers
50 in total
  • [21] Classification of application reviews into software maintenance tasks using data mining techniques
    Assem Al-Hawari
    Hassan Najadat
    Raed Shatnawi
    Software Quality Journal, 2021, 29 : 667 - 703
  • [22] Studying software evolution using topic models
    Thomas, Stephen W.
    Adams, Bram
    Hassan, Ahmed E.
    Blostein, Dorothea
    SCIENCE OF COMPUTER PROGRAMMING, 2014, 80 : 457 - 479
  • [23] Studying software logging using topic models
    Li, Heng
    Chen, Tse-Hsun
    Shang, Weiyi
    Hassan, Ahmed E.
    EMPIRICAL SOFTWARE ENGINEERING, 2018, 23 (05) : 2655 - 2694
  • [24] Studying software logging using topic models
    Heng Li
    Tse-Hsun (Peter) Chen
    Weiyi Shang
    Ahmed E. Hassan
    Empirical Software Engineering, 2018, 23 : 2655 - 2694
  • [25] Using Information Retrieval to Support Software Maintenance Tasks
    Poshyvanyk, Denys
    2009 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, CONFERENCE PROCEEDINGS, 2009 : 453 - 456
  • [26] Topic recommendation for software repositories using multi-label classification algorithms
    Maliheh Izadi
    Abbas Heydarnoori
    Georgios Gousios
    Empirical Software Engineering, 2021, 26
  • [27] Topic recommendation for software repositories using multi-label classification algorithms
    Izadi, Maliheh
    Heydarnoori, Abbas
    Gousios, Georgios
    EMPIRICAL SOFTWARE ENGINEERING, 2021, 26 (05)
  • [28] Mining Individual Performance Indicators in Collaborative Development Using Software Repositories
    Zhang, Shen
    Wang, Yongji
    Xiao, Junchao
    APSEC 2008: 15TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, PROCEEDINGS, 2008 : 247 - 254
  • [29] Using Language-Based Search in Mining Large Software Repositories
    Abu Bakar, Normi Sham Awang
    COMPUTATIONAL LINGUISTICS AND RELATED FIELDS, 2011, 27 : 160 - 168
  • [30] Using Probabilistic Topic Models in Enterprise Social Software
    Christidis, Konstantinos
    Mentzas, Gregoris
    BUSINESS INFORMATION SYSTEMS, PROCEEDINGS, 2010, 47 : 23 - 34