Two Birds with One Stone: Boosting Code Generation and Code Search via a Generative Adversarial Network

Cited by: 4
|
Authors
Wang, Shangwen [1 ,6 ]
Lin, Bo [1 ]
Sun, Zhensu [2 ]
Wen, Ming [3 ]
Liu, Yepang [4 ]
Lei, Yan [5 ]
Mao, Xiaoguang [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Key Lab Software Engn Complex Syst, Changsha, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Huazhong Univ Sci & Technol, Sch Cyber Sci & Engn, Wuhan, Peoples R China
[4] Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Dept Comp Sci & Engn, Shenzhen, Peoples R China
[5] Chongqing Univ, Chongqing, Peoples R China
[6] Southern Univ Sci & Technol, Shenzhen, Peoples R China
Source
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Code Generation; Code Search; Generative Adversarial Network;
DOI
10.1145/3622815
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Automatically transforming developers' natural language descriptions into source code has been a longstanding goal in software engineering research. Two types of approaches have been proposed in the literature to achieve this: code generation, which involves generating a new code snippet, and code search, which involves reusing existing code. However, despite existing efforts, the effectiveness of state-of-the-art techniques remains limited. Seeking further advancement, we observe that code generation and code search can help overcome each other's limitations: the code generator can benefit from feedback on the quality of its generated code, which can be provided by the code searcher, while the code searcher can benefit from the additional training data supplied by the code generator to better understand code semantics. Drawing on this insight, we propose a novel approach that combines code generation and code search techniques using a generative adversarial network (GAN), enabling mutual improvement through adversarial training. Specifically, we treat code generation and code search as the generator and discriminator in the GAN framework, respectively, and incorporate several customized designs for our tasks. We evaluate our approach in eight different settings and consistently observe significant performance improvements for both code generation and code search. For instance, when using NatGen, a state-of-the-art code generator, as the generator and GraphCodeBERT, a state-of-the-art code searcher, as the discriminator, we achieve a 32% increase in CodeBLEU score for code generation and a 12% increase in mean reciprocal rank for code search on a large-scale Python dataset, compared with their original performance.
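The abstract reports code search gains as a relative increase in mean reciprocal rank (MRR). As a minimal sketch of how such a figure could be computed, the Python below implements the standard MRR definition and a relative-gain helper. The function names and the sample ranks are hypothetical, chosen only for illustration; they are not taken from the paper or its evaluation scripts.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Standard MRR over a set of queries.

    ranks[i] is the 1-based position of the correct code snippet in the
    result list for query i; 0 means the snippet was not retrieved.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    # A miss (rank 0) contributes a reciprocal rank of 0.
    return sum(1.0 / r if r > 0 else 0.0 for r in ranks) / len(ranks)


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, e.g. 0.12 for 12%."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before


# Hypothetical ranks for three queries (illustrative values only).
baseline_mrr = mean_reciprocal_rank([1, 2, 4])
improved_mrr = mean_reciprocal_rank([1, 1, 4])
gain = relative_gain(baseline_mrr, improved_mrr)
```

Under this definition a reported "12% increase" is read as a relative gain over the baseline MRR, not as 12 absolute MRR points.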
Pages: 30