A syntax-guided multi-task learning approach for Turducken-style code generation

Cited by: 2
Authors
Yang, Guang [1 ]
Zhou, Yu [1 ]
Chen, Xiang [2 ]
Zhang, Xiangyu [1 ]
Xu, Yiran [1 ]
Han, Tingting [3 ]
Chen, Taolue [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong, Peoples R China
[3] Birkbeck Univ London, Dept Comp Sci, London, England
Funding
National Natural Science Foundation of China;
Keywords
Syntactically-constrained code generation; Turducken-style code; Multi-task learning; CodeT5; Abstract syntax tree;
DOI
10.1007/s10664-023-10372-1
CLC classification
TP31 [Computer software];
Discipline codes
081202; 0835;
Abstract
Due to the development of pre-trained language models, automated code generation techniques have shown great promise in recent years. However, the generated code does not always adhere to the syntactic constraints of the target language, especially for Turducken-style code, where declarative code snippets are embedded within imperative programs. In this study, we summarize three significant challenges regarding syntactic constraints: (1) the efficient representation of syntactic constraints, (2) the effective integration of syntactic information, and (3) a scalable syntax-first decoding algorithm. To address these challenges, we propose a syntax-guided multi-task learning approach, TurduckenGen. Specifically, we first explicitly append type information to the code tokens to capture the representation of syntactic constraints. Then we formalize code generation with syntactic constraint representation as an auxiliary task, enabling the model to learn the syntactic constraints of the code. Finally, syntactically correct code is accurately selected from multiple candidates with the help of compiler feedback. Extensive experiments and comprehensive analysis demonstrate the effectiveness and general applicability of our approach in comparison with six state-of-the-art baselines on two Turducken-style code datasets. In addition, a human study shows that the code generated by our approach outperforms the baselines in terms of code readability and semantic similarity.
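For illustration only, below is a minimal sketch of the compiler-feedback candidate-selection step summarized in the abstract. It is not the paper's implementation: the host language is assumed to be Python, compiler feedback is approximated with Python's ast module, and generate_candidates is a hypothetical stand-in for any beam-search generator that returns ranked candidates.

    # Hypothetical sketch (not from the paper): re-rank generated candidates by
    # a syntax check and return the first one that passes, mirroring the
    # "select the syntactically correct code from multiple candidates" step.
    import ast
    from typing import Callable, List, Optional

    def host_language_ok(code: str) -> bool:
        # Approximate compiler feedback for the imperative host language:
        # the candidate must at least parse as valid Python.
        try:
            ast.parse(code)
            return True
        except SyntaxError:
            return False

    def select_candidate(nl_query: str,
                         generate_candidates: Callable[[str], List[str]],
                         syntax_checks=(host_language_ok,)) -> Optional[str]:
        # generate_candidates stands in for a fine-tuned model (e.g., CodeT5)
        # that returns candidates in best-first order.
        candidates = generate_candidates(nl_query)
        for cand in candidates:
            if all(check(cand) for check in syntax_checks):
                return cand
        # Fall back to the top-ranked candidate if none passes the checks.
        return candidates[0] if candidates else None

An analogous check for the embedded declarative snippets (e.g., validating extracted SQL strings) could be appended to syntax_checks in the same way.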
Pages: 35