Evaluating Unsupervised Text Embeddings on Software User Feedback

Cited by: 11
Authors
Devine, Peter [1]
Koh, Yun Sing [1]
Blincoe, Kelly [1]
Affiliations
[1] Univ Auckland, Auckland, New Zealand
Keywords
REVIEWS; MODELS;
DOI
10.1109/REW53955.2021.00020
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
User feedback on software products has been shown to be useful for development and can be exceedingly abundant online. Many approaches have been developed to elicit requirements from this large volume of feedback, including unsupervised clustering underpinned by text embeddings. Methods for embedding text vary significantly across the literature, and there is no consensus on which approaches best cluster user feedback into requirements-relevant groups. This work proposes a methodology for comparing text embeddings of user feedback using existing labelled datasets. Using seven diverse datasets from the literature, we apply this methodology to evaluate both established text embedding techniques from the user feedback analysis literature (including topic modelling and word embeddings) and text embeddings from state-of-the-art deep text embedding models. Results demonstrate that text embeddings produced by state-of-the-art models, most notably the Universal Sentence Encoder (USE), group feedback with similar requirements-relevant characteristics together better than the other evaluated techniques across all seven datasets. These results can help researchers select appropriate embedding techniques when developing future unsupervised clustering approaches for user feedback analysis.
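The abstract does not say which score the authors use to compare embeddings against labels, so the following is only a minimal sketch of the general idea, assuming a simple leave-one-out check: for each feedback item, does its most similar neighbour (by cosine similarity) carry the same label? The function names, the toy vectors, and the `bug`/`feature` labels are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbour_label_agreement(embeddings, labels):
    """Fraction of items whose most similar other item shares their label."""
    hits = 0
    for i, (vec, label) in enumerate(zip(embeddings, labels)):
        # Index of the nearest neighbour, excluding the item itself.
        best_j = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine(vec, embeddings[j]),
        )
        hits += labels[best_j] == label
    return hits / len(embeddings)


# Toy data invented for illustration (assumption, not from the paper).
labels = ["bug", "bug", "feature", "feature"]
embedding_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # labels grouped
embedding_b = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]  # labels mixed

score_a = neighbour_label_agreement(embedding_a, labels)
score_b = neighbour_label_agreement(embedding_b, labels)
print(score_a, score_b)
```

Under this assumed score, an embedding that places same-label feedback close together scores higher, which is one way to rank embedding techniques on a labelled dataset without training a classifier.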
Pages: 87-95
Page count: 9