A syntax-guided multi-task learning approach for Turducken-style code generation

Cited by: 2
Authors
Yang, Guang [1 ]
Zhou, Yu [1 ]
Chen, Xiang [2 ]
Zhang, Xiangyu [1 ]
Xu, Yiran [1 ]
Han, Tingting [3 ]
Chen, Taolue [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong, Peoples R China
[3] Birkbeck Univ London, Dept Comp Sci, London, England
Funding
National Natural Science Foundation of China;
Keywords
Syntactically-constrained code generation; Turducken-style code; Multi-task learning; CodeT5; Abstract syntax tree;
DOI
10.1007/s10664-023-10372-1
CLC classification
TP31 [Computer software];
Discipline codes
081202; 0835;
Abstract
Due to the development of pre-trained language models, automated code generation techniques have shown great promise in recent years. However, the generated code does not always adhere to the syntactic constraints of the target language, especially for Turducken-style code, where declarative code snippets are embedded within imperative programs. In this study, we summarize three significant challenges regarding syntactic constraints: (1) the efficient representation of syntactic constraints, (2) the effective integration of syntactic information, and (3) a scalable syntax-first decoding algorithm. To address these challenges, we propose a syntax-guided multi-task learning approach, TurduckenGen. Specifically, we first explicitly append type information to the code tokens to capture the representation of syntactic constraints. Then we formalize code generation with syntactic constraint representation as an auxiliary task, enabling the model to learn the syntactic constraints of the code. Finally, syntactically correct code is accurately selected from multiple candidates with the help of compiler feedback. Extensive experiments and comprehensive analysis demonstrate the effectiveness and general applicability of our approach in comparison with six state-of-the-art baselines on two Turducken-style code datasets. In addition, a human study shows that the code generated by our approach outperforms the baselines in terms of code readability and semantic similarity.
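For illustration only, below is a minimal sketch of the compiler-feedback candidate-selection step summarized in the abstract. It is not the paper's implementation: the host language is assumed to be Python, compiler feedback is approximated with Python's ast module, and generate_candidates is a hypothetical stand-in for any beam-search generator that returns ranked candidates.

    # Hypothetical sketch (not from the paper): re-rank generated candidates by
    # a syntax check and return the first one that passes, mirroring the
    # "select the syntactically correct code from multiple candidates" step.
    import ast
    from typing import Callable, List, Optional

    def host_language_ok(code: str) -> bool:
        # Approximate compiler feedback for the imperative host language:
        # the candidate must at least parse as valid Python.
        try:
            ast.parse(code)
            return True
        except SyntaxError:
            return False

    def select_candidate(nl_query: str,
                         generate_candidates: Callable[[str], List[str]],
                         syntax_checks=(host_language_ok,)) -> Optional[str]:
        # generate_candidates stands in for a fine-tuned model (e.g., CodeT5)
        # that returns candidates in best-first order.
        candidates = generate_candidates(nl_query)
        for cand in candidates:
            if all(check(cand) for check in syntax_checks):
                return cand
        # Fall back to the top-ranked candidate if none passes the checks.
        return candidates[0] if candidates else None

An analogous check for the embedded declarative snippets (e.g., validating extracted SQL strings) could be appended to syntax_checks in the same way.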
Pages: 35