Two Birds with One Stone: Boosting Code Generation and Code Search via a Generative Adversarial Network

Cited by: 4
Authors
Wang, Shangwen [1, 6]
Lin, Bo [1]
Sun, Zhensu [2]
Wen, Ming [3]
Liu, Yepang [4]
Lei, Yan [5]
Mao, Xiaoguang [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Key Lab Software Engn Complex Syst, Changsha, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Huazhong Univ Sci & Technol, Sch Cyber Sci & Engn, Wuhan, Peoples R China
[4] Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Dept Comp Sci & Engn, Shenzhen, Peoples R China
[5] Chongqing Univ, Chongqing, Peoples R China
[6] Southern Univ Sci & Technol, Shenzhen, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Code Generation; Code Search; Generative Adversarial Network;
DOI
10.1145/3622815
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Automatically transforming developers' natural language descriptions into source code has been a longstanding goal in software engineering research. Two types of approaches have been proposed in the literature to achieve this: code generation, which generates a new code snippet, and code search, which reuses existing code. Despite these efforts, the effectiveness of state-of-the-art techniques remains limited. To advance further, our insight is that code generation and code search can help overcome each other's limitations: the code generator can benefit from feedback on the quality of its generated code, which the code searcher can provide, while the code searcher can benefit from additional training data augmented by the code generator, which helps it better understand code semantics. Drawing on this insight, we propose a novel approach that combines code generation and code search techniques using a generative adversarial network (GAN), enabling mutual improvement through adversarial training. Specifically, we treat code generation and code search as the generator and discriminator in the GAN framework, respectively, and incorporate several customized designs for our tasks. We evaluate our approach in eight different settings and consistently observe significant performance improvements for both code generation and code search. For instance, when using NatGen, a state-of-the-art code generator, as the generator and GraphCodeBERT, a state-of-the-art code searcher, as the discriminator, we achieve a 32% increase in CodeBLEU score for code generation and a 12% increase in mean reciprocal rank for code search on a large-scale Python dataset, compared to their original performance.
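The abstract reports code search quality as mean reciprocal rank (MRR). As a minimal sketch of how that metric is usually computed (this is the standard definition, not code from the paper; the function name and the example ranks are hypothetical), each query contributes the reciprocal of the 1-based rank at which the correct snippet is retrieved, and the contributions are averaged:

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries.

    ranks[i] is the 1-based position of the correct code snippet
    in the ranked list returned for query i.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for three queries (illustrative values, not paper data):
# the correct snippet is retrieved at positions 1, 2, and 4.
example_ranks = [1, 2, 4]
print(f"MRR = {mean_reciprocal_rank(example_ranks):.4f}")
```

A higher MRR means correct snippets tend to appear nearer the top of the ranking. The abstract does not state how the reported 12% increase is calculated, so this sketch makes no assumption about that.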
Pages: 30