A feature location approach for mapping application features extracted from crowd-based screencasts to source code

Authors
Parisa Moslehi
Bram Adams
Juergen Rilling
Affiliations
[1] Concordia University
[2] Queen’s University
Source
Empirical Software Engineering, 2020, 25 (06)
Keywords
Crowd-based documentation; Mining video content; Speech analysis; Feature location; Software traceability; Information extraction; Software documentation;
DOI
Not available
Abstract
Crowd-based multimedia documents such as screencasts have emerged as a source for documenting requirements, workflows, and implementation issues of open source and agile software projects. For example, users can show and narrate how they manipulate an application’s GUI to perform a certain function, or a bug reporter can visually explain how to trigger a bug or a security vulnerability. Unfortunately, the streaming nature of programming screencasts and their binary format limit how developers can interact with a screencast’s content. In this research, we present an automated approach for mining the multimedia content found in screencasts and linking it to relevant software artifacts and, more specifically, to source code. We apply LDA-based mining approaches that take as input a set of screencast artifacts, such as GUI text and spoken words, to make the screencast content accessible and searchable to users and to link it to the relevant source code artifacts. To evaluate the applicability of our approach, we report on results from case studies that we conducted on existing WordPress and Mozilla Firefox screencasts. We found that our automated approach can significantly speed up the feature location process. For WordPress, we find that our approach using screencast speech and GUI text can successfully link relevant source code files within the top 10 hits of the result set with median Reciprocal Rank (RR) of 50% (rank 2) and 100% (rank 1). In the case of Firefox, our approach can identify relevant source code directories within the top 100 hits using screencast speech and GUI text with a median RR of 20%, meaning that the first true positive is ranked fifth or better in more than 50% of the cases. Also, source code related to the frontend implementation, which handles high-level or GUI-related aspects of an application, is located with higher accuracy.
We also found that term frequency rebalancing can further improve the linking results for less noisy scenarios or for scenarios with a less technical implementation. Comparing which original and weighted screencast data sources (speech, GUI text, or speech and GUI text combined) yield the highest median RR values in both case studies shows that speech data is an important information source that can yield an RR of 100%.
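The Reciprocal Rank figures quoted in the abstract follow from the standard definition RR = 1 / (rank of the first relevant hit), so rank 1 gives 100%, rank 2 gives 50%, and rank 5 gives 20%. The sketch below illustrates that arithmetic only; it is not the authors' evaluation code, and the function name, file names, and ground-truth sets are hypothetical.

```python
from statistics import median


def reciprocal_rank(ranked_items, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical queries: each pairs a ranked result list with its
# ground-truth set of relevant files (illustrative names, not from the paper).
queries = [
    (["a.php", "b.php", "c.php"], {"b.php"}),                    # first hit at rank 2
    (["x.php", "y.php"], {"x.php"}),                             # first hit at rank 1
    (["p.php", "q.php", "r.php", "s.php", "t.php"], {"t.php"}),  # first hit at rank 5
]

rr_values = [reciprocal_rank(results, truth) for results, truth in queries]
median_rr = median(rr_values)

print(rr_values)   # one RR value per query
print(median_rr)   # median RR across the queries
```

A median RR of 20% therefore means that for at least half of the queries the first correct result appears at rank 5 or better.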
Pages: 4873 - 4926
Page count: 53
Related papers
45 in total
  • [1] A feature location approach for mapping application features extracted from crowd-based screencasts to source code
    Moslehi, Parisa
    Adams, Bram
    Rilling, Juergen
    EMPIRICAL SOFTWARE ENGINEERING, 2020, 25 (06) : 4873 - 4926
  • [2] Feature Location using Crowd-based Screencasts
    Moslehi, Parisa
    Adams, Bram
    Rilling, Juergen
    2018 IEEE/ACM 15TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR), 2018 : 192 - 202
  • [3] Feature location enhancement based on source code augmentation with synonyms of terms
    Saifan, Ahmad A.
    Obeidat, Lana
    SOFTWARE-PRACTICE & EXPERIENCE, 2021, 51 (02) : 235 - 259
  • [4] An efficient and secure feature location approach in source code using Jacobian matrix-based clustering
    Balaji, N.
    Lakshmi, S.
    Anand, M.
    Anbarasan, M.
    Mathiyalagan, P.
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (12) : 7235 - 7247
  • [5] An efficient and secure feature location approach in source code using Jacobian matrix-based clustering
    N. Balaji
    S. Lakshmi
    M. Anand
    M. Anbarasan
    P. Mathiyalagan
    Neural Computing and Applications, 2021, 33 : 7235 - 7247
  • [6] An approach for mapping features to code based on static and dynamic analysis
    Rohatgi, Abhishek
    Hamou-Lhadj, Abdelwahab
    Rilling, Juergen
    PROCEEDINGS OF THE 16TH IEEE INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, 2008 : 234 - 239
  • [7] Radiomics approach to quantify shape irregularity from crowd-based qualitative assessment of intracranial aneurysms
    Juchler, Norman
    Schilling, Sabine
    Gluge, Stefan
    Bijlenga, Philippe
    Rufenacht, Daniel
    Kurtcuoglu, Vartan
    Hirsch, Sven
    COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, 2020, 8 (05) : 538 - 546
  • [8] Feature Location in Source Code by Trace-Based Impact Analysis and Information Retrieval
    Cai, Zhengong
    Yang, Xiaohu
    Wang, Xinyu
    Kavs, Aleksander J.
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (01) : 205 - 214
  • [9] What Code Implements Such Service? A Behavior Model Based Feature Location Approach
    Liang, Guangtai
    Dang, Yabin
    Chen, Hao
    Mei, Lijun
    Li, Shaochun
    Chee, Yi-Min
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2016), 2016 : 122 - 129
  • [10] Software product lines and features from the perspective of set theory with an application to feature location
    Eisenecker, Ulrich
    Mueller, Richard
    JOURNAL OF SYSTEMS AND SOFTWARE, 2024, 210