No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence

Cited by: 55

Authors:

Wang, Chaozheng [1]

Yang, Yuanhang [1]

Gao, Cuiyun [1,4,5]

Peng, Yun [2]

Zhang, Hongyu [3]

Lyu, Michael R. [2]

Affiliations:

[1] Harbin Inst Technol, Shenzhen, Peoples R China

[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[3] Univ Newcastle, Newcastle, NSW, Australia

[4] Peng Cheng Lab, Shenzhen, Peoples R China

[5] Guangdong Prov Key Lab Novel Secur Intelligence T, Shenzhen, Peoples R China

Source:

PROCEEDINGS OF THE 30TH ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

code intelligence; prompt tuning; empirical study;

DOI:

10.1145/3540250.3549113

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Pre-trained models have been shown to be effective in many code intelligence tasks. These models are pre-trained on large-scale unlabeled corpora and then fine-tuned on downstream tasks. However, as the inputs to pre-training and downstream tasks are in different forms, it is hard to fully exploit the knowledge of pre-trained models. Besides, the performance of fine-tuning strongly relies on the amount of downstream data, while in practice, scenarios with scarce data are common. Recent studies in the natural language processing (NLP) field show that prompt tuning, a new paradigm for tuning, alleviates the above issues and achieves promising results in various NLP tasks. In prompt tuning, the prompts inserted during tuning provide task-specific knowledge, which is especially beneficial for tasks with relatively scarce data. In this paper, we empirically evaluate the usage and effect of prompt tuning in code intelligence tasks. We conduct prompt tuning on the popular pre-trained models CodeBERT and CodeT5 and experiment with three code intelligence tasks: defect prediction, code summarization, and code translation. Our experimental results show that prompt tuning consistently outperforms fine-tuning in all three tasks. In addition, prompt tuning shows great potential in low-resource scenarios, e.g., improving the BLEU scores of fine-tuning by more than 26% on average for code summarization. Our results suggest that instead of fine-tuning, we could adapt prompt tuning for code intelligence tasks to achieve better performance, especially when lacking task-specific data.
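
As a rough illustration of how an inserted prompt can carry task-specific knowledge, the sketch below wraps a code snippet in a cloze-style template for defect prediction and maps label words back to classes. The template wording, the label words, and the `toy_mask_scores` stand-in for a pre-trained masked language model are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Callable, Dict

# Hypothetical hard-prompt template: the classification task is rephrased
# as filling in a masked word, which resembles a masked-LM pre-training input.
MASK_TOKEN = "<mask>"
TEMPLATE = "The code {code} is {mask} ."

# Hypothetical verbalizer: label words the model may predict at the mask,
# mapped to class ids (1 = defective, 0 = clean).
VERBALIZER: Dict[str, int] = {"defective": 1, "clean": 0}


def build_prompt(code: str) -> str:
    """Insert the code snippet and the mask token into the template."""
    return TEMPLATE.format(code=code, mask=MASK_TOKEN)


def classify(code: str, mask_scores: Callable[[str], Dict[str, float]]) -> int:
    """Pick the class whose label word scores highest at the mask position."""
    scores = mask_scores(build_prompt(code))
    best_word = max(VERBALIZER, key=lambda w: scores.get(w, float("-inf")))
    return VERBALIZER[best_word]


def toy_mask_scores(prompt: str) -> Dict[str, float]:
    """Toy stand-in for a masked language model (assumption, not a real model).

    A real setup would query a pre-trained model for word scores at the mask.
    """
    risky = "strcpy(" in prompt
    return {
        "defective": 0.9 if risky else 0.2,
        "clean": 0.1 if risky else 0.8,
    }


if __name__ == "__main__":
    print(build_prompt("strcpy(buf, s);"))
    print(classify("strcpy(buf, s);", toy_mask_scores))
```

In this sketch only the template and verbalizer encode the task; swapping `toy_mask_scores` for a call into an actual pre-trained model would leave `build_prompt` and `classify` unchanged.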


Pages: 382-394

Page count: 13
