Evaluating Unsupervised Text Embeddings on Software User Feedback

Cited by: 11
Authors
Devine, Peter [1]
Koh, Yun Sing [1]
Blincoe, Kelly [1]
Affiliations
[1] Univ Auckland, Auckland, New Zealand
Keywords
REVIEWS; MODELS;
DOI
10.1109/REW53955.2021.00020
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
User feedback on software products has been shown to be useful for development and can be exceedingly abundant online. Many approaches have been developed to elicit requirements from this large volume of feedback, including unsupervised clustering underpinned by text embeddings. Methods for embedding text vary significantly within the literature, highlighting the lack of consensus as to which approaches are best able to cluster user feedback into requirements-relevant groups. This work proposes a methodology for comparing text embeddings of user feedback using existing labelled datasets. Using seven diverse datasets from the literature, we apply this methodology to evaluate both established text embedding techniques from the user feedback analysis literature (including topic modelling and word embeddings) and text embeddings from state-of-the-art deep text embedding models. Results demonstrate that text embeddings produced by state-of-the-art models, most notably the Universal Sentence Encoder (USE), group feedback with similar requirements-relevant characteristics together better than the other evaluated techniques across all seven datasets. These results can help researchers select appropriate embedding techniques when developing future unsupervised clustering approaches within user feedback analysis.
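The idea of comparing embeddings against existing labels can be illustrated with a minimal sketch. The abstract does not state which metric the paper uses, so the nearest-neighbour label agreement score below is an assumption chosen for illustration, not the authors' method. The function names, the toy two-dimensional vectors, and the "bug"/"feature" labels are all hypothetical; in practice the vectors would come from an embedding model such as USE.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighbour_label_agreement(embeddings, labels):
    """Fraction of items whose nearest neighbour (by cosine) shares their label.

    Illustrative metric only (an assumption, not taken from the paper):
    a higher score means same-labelled feedback sits closer together.
    """
    if len(embeddings) != len(labels) or len(embeddings) < 2:
        raise ValueError("need at least two items and one label per item")
    hits = 0
    for i, vec in enumerate(embeddings):
        # Nearest neighbour of item i, excluding the item itself.
        nearest = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine(vec, embeddings[j]),
        )
        if labels[nearest] == labels[i]:
            hits += 1
    return hits / len(embeddings)


# Hypothetical toy data: four feedback items embedded by two techniques.
labels = ["bug", "bug", "feature", "feature"]
embedding_a = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
embedding_b = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]

score_a = neighbour_label_agreement(embedding_a, labels)
score_b = neighbour_label_agreement(embedding_b, labels)
print(f"technique A: {score_a:.2f}, technique B: {score_b:.2f}")
```

Under this toy setup, the technique with the higher score places same-labelled feedback closer together, which is the kind of comparison the abstract describes across its seven datasets.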
Pages: 87-95
Page count: 9