Aspect-Based API Review Classification: How Far Can Pre-Trained Transformer Model Go?

Cited by: 17
Authors
Yang, Chengran [1 ]
Xu, Bowen [1 ]
Khan, Junaed Younus [2 ]
Uddin, Gias [2 ]
Han, Donggyun [1 ]
Yang, Zhou [1 ]
Lo, David [1 ]
Affiliations
[1] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
[2] Univ Calgary, Dept Elect & Comp Engn, Calgary, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
software mining; natural language processing; multi-label classification; pre-trained models;
DOI
10.1109/SANER53432.2022.00054
CLC Classification Number
TP31 [Computer Software];
Subject Classification Code
081202; 0835;
Abstract
APIs (Application Programming Interfaces) are reusable software libraries and are building blocks for modern rapid software development. Previous research shows that programmers frequently share and search for reviews of APIs on mainstream software question and answer (Q&A) platforms such as Stack Overflow, which motivates researchers to design tasks and approaches for processing API reviews automatically. Among these tasks, classifying API reviews into different aspects (e.g., performance or security), known as aspect-based API review classification, is of great importance. The current state-of-the-art (SOTA) solution to this task is based on a traditional machine learning algorithm. Inspired by the great success achieved by pre-trained models on many software engineering tasks, this study fine-tunes six pre-trained models for the aspect-based API review classification task and compares them with the current SOTA solution on an API review benchmark collected by Uddin et al. The investigated models include four models (BERT, RoBERTa, ALBERT and XLNet) that are pre-trained on natural language, BERTOverflow, which is pre-trained on a text corpus extracted from Stack Overflow posts, and CosSensBERT, which is designed for handling imbalanced data. The results show that all six fine-tuned models outperform the traditional machine learning-based tool; more specifically, the improvement in F1-score ranges from 21.0% to 30.2%. We also find that BERTOverflow, the model pre-trained on the Stack Overflow corpus, does not show better performance than BERT. The results also suggest that CosSensBERT does not exhibit better performance than BERT in terms of F1-score, but it is still worth considering as it achieves better performance on MCC and AUC.
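The abstract describes fine-tuning pre-trained transformers for multi-label aspect classification of API reviews. The paper's replication package is not reproduced here; the following minimal sketch only illustrates the general recipe, assuming the Hugging Face transformers and PyTorch APIs, a hypothetical aspect label list, a hypothetical reviews.csv file with a "text" column and one binary column per aspect, and typical (not the authors') hyperparameters. It is not the authors' implementation.

# Minimal multi-label fine-tuning sketch (NOT the authors' implementation).
# Assumes: pandas, torch, and transformers installed; hypothetical reviews.csv.
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ASPECTS = ["performance", "security", "usability", "documentation"]  # hypothetical subset

class ReviewDataset(Dataset):
    def __init__(self, frame, tokenizer):
        # Tokenize all review texts up front into fixed-length tensors.
        self.enc = tokenizer(list(frame["text"]), truncation=True,
                             padding="max_length", max_length=128, return_tensors="pt")
        # Multi-hot aspect labels must be floats for the multi-label loss.
        self.labels = torch.tensor(frame[ASPECTS].values, dtype=torch.float)

    def __len__(self):
        return self.labels.size(0)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(ASPECTS),
    problem_type="multi_label_classification",  # uses BCEWithLogitsLoss internally
)

train_df = pd.read_csv("reviews.csv")           # hypothetical data file
loader = DataLoader(ReviewDataset(train_df, tokenizer), batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):                          # small number of fine-tuning epochs
    for batch in loader:
        optimizer.zero_grad()
        out = model(**batch)                    # loss computed against multi-hot labels
        out.loss.backward()
        optimizer.step()

# At inference time, sigmoid(logits) > 0.5 yields the predicted aspect set.

In this setup, comparing backbones as the paper does would amount to swapping the checkpoint name (e.g., roberta-base or xlnet-base-cased); handling class imbalance as in a cost-sensitive variant would additionally require reweighting the loss, which is not shown here.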
Pages: 385-395
Number of pages: 11
Related Papers
50 records in total
  • [41] Analysis of EU's Coupled Carbon and Electricity Market Development Based on Generative Pre-Trained Transformer Large Model and Implications in China
    Li, Yao
    Ni, Siyuan
    Tang, Xi
    Xie, Sizhe
    Wang, Peng
    SUSTAINABILITY, 2024, 16 (23)
  • [42] Can pre-trained convolutional neural networks be directly used as a feature extractor for video-based neonatal sleep and wake classification?
    Awais, Muhammad
    Long, Xi
    Yin, Bin
    Chen, Chen
    Akbarzadeh, Saeed
    Abbasi, Saadullah Farooq
    Irfan, Muhammad
    Lu, Chunmei
    Wang, Xinhua
    Wang, Laishuan
    Chen, Wei
    BMC RESEARCH NOTES, 2020, 13 (01)
  • [43] Diagnosing Glaucoma Based on the Ocular Hypertension Treatment Study Dataset Using Chat Generative Pre-Trained Transformer as a Large Language Model
    Raja, Hina
    Huang, Xiaoqin
    Delsoz, Mohammad
    Madadi, Yeganeh
    Poursoroush, Asma
    Munawar, Asim
    Kahook, Malik Y.
    Yousefi, Siamak
    OPHTHALMOLOGY SCIENCE, 2025, 5 (01):
  • [44] Can pre-trained convolutional neural networks be directly used as a feature extractor for video-based neonatal sleep and wake classification?
    Awais, Muhammad
    Long, Xi
    Yin, Bin
    Chen, Chen
    Akbarzadeh, Saeed
    Abbasi, Saadullah Farooq
    Irfan, Muhammad
    Lu, Chunmei
    Wang, Xinhua
    Wang, Laishuan
    Chen, Wei
    BMC RESEARCH NOTES, 2020, 13 (01)
  • [45] A Hybrid Neural Network BERT-Cap Based on Pre-Trained Language Model and Capsule Network for User Intent Classification
    Liu, Hai
    Liu, Yuanxia
    Wong, Leung-Pun
    Lee, Lap-Kei
    Hao, Tianyong
    COMPLEXITY, 2020, 2020
  • [46] Knowledge-Aware Collaborative Filtering With Pre-Trained Language Model for Personalized Review-Based Rating Prediction
    Wang, Quanxiu
    Cao, Xinlei
    Wang, Jianyong
    Zhang, Wei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (03) : 1170 - 1182
  • [47] Can using a pre-trained deep learning model as the feature extractor in the bag-of-deep-visual-words model always improve image classification accuracy?
    Xu, Ye
    Zhang, Xin
    Huang, Chongpeng
    Qiu, Xiaorong
    PLOS ONE, 2024, 19 (02):
  • [48] LastResort at SemEval-2022 Task 4: Towards Patronizing and Condescending Language Detection using Pre-trained Transformer Based Model Ensembles
    Agrawal, Samyak
    Mamidi, Radhika
    PROCEEDINGS OF THE 16TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2022, 2022, : 352 - 356
  • [49] Advanced Attention-Based Pre-Trained Transfer Learning Model for Accurate Brain Tumor Detection and Classification from MRI Images
    Priya, A.
    Vasudevan, V.
    OPTICAL MEMORY AND NEURAL NETWORKS, 2024, 33 (04) : 477 - 491
  • [50] Review of Swarm Fuzzy Classifier and a Convolutional Neural Network with VGG-16 Pre-Trained Model on Dental Panoramic Radiograph for Osteoporosis Classification
    Abubakar, Usman Bello
    Boukar, Moussa Mahamat
    Dane, Senol
    JOURNAL OF RESEARCH IN MEDICAL AND DENTAL SCIENCE, 2022, 10 (01): : 193 - 197