CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

Cited by: 71
Authors
Yao, Ziyu [1]
Peddamail, Jayavardhan Reddy [1]
Sun, Huan [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
Keywords
Code Annotation; Code Retrieval; Reinforcement Learning
DOI
10.1145/3308558.3313632
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code snippets relevant to a given natural language query from a code base, and code annotation, where the goal is to annotate a code snippet with a natural language description. Despite their advancement in recent years, the two tasks are mostly explored separately. In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called "CoaCor"), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly.
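The idea of rewarding an annotation by how well it retrieves its own code snippet can be illustrated with a small sketch. This is not the authors' implementation: the token-overlap scorer below is a hypothetical stand-in for a trained retrieval model, the function names are invented for illustration, and using the reciprocal rank as the reward signal is an assumption made here.

```python
import re

def tokens(text):
    """Lowercased alphabetic tokens of a string, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap_score(annotation, code):
    """Jaccard overlap; a toy stand-in for a learned retrieval model."""
    a, c = tokens(annotation), tokens(code)
    union = a | c
    return len(a & c) / len(union) if union else 0.0

def retrieval_reward(annotation, target_code, distractors):
    """Reciprocal rank of the target snippet when the annotation is the query.

    Assumption: in an RL setup, a value like this would be fed back to the
    annotation generator as its reward.
    """
    target_score = overlap_score(annotation, target_code)
    # Rank = 1 + number of distractors scoring strictly higher than the target.
    rank = 1 + sum(overlap_score(annotation, d) > target_score for d in distractors)
    return 1.0 / rank

target = "sorted(items, key=len)"
distractors = ["open(path).read()", "json.dumps(obj)"]

detailed = retrieval_reward("sort items by key len", target, distractors)
vague = retrieval_reward("read json", target, distractors)
print(detailed, vague)
```

In this toy setting the annotation that names what the snippet does ranks its snippet first, while the unrelated annotation ranks it last, so the former earns the larger reward.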
Pages: 2203 - 2214
Number of pages: 12
Related papers
50 in total
  • [31] Quantum error correction for the toric code using deep reinforcement learning
    Andreasson, Philip
    Johansson, Joel
    Liljestrand, Simon
    Granath, Mats
    QUANTUM, 2019, 3
  • [32] Improving Automatic Source Code Summarization via Deep Reinforcement Learning
    Wan, Yao
    Zhao, Zhou
    Yang, Min
    Xu, Guandong
    Ying, Haochao
    Wu, Jian
    Yu, Philip S.
    PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18), 2018, : 397 - 407
  • [33] The Open-Source TEXPLORE Code Release for Reinforcement Learning on Robots
    Hester, Todd
    Stone, Peter
    ROBOCUP 2013: ROBOT WORLD CUP XVII, 2014, 8371 : 536 - 543
  • [34] Scheduling straight-line code using reinforcement learning and rollouts
    McGovern, A
    Moss, E
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 903 - 909
  • [35] concept2code: Deep Reinforcement Learning for Conversational AI
    Sonie, Omprakash
    Chakraborty, Abir
    Mullick, Ankan
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4826 - 4827
  • [36] Prescriptive procedure for manual code smell annotation
    Prokic, Simona
    Luburic, Nikola
    Slivka, Jelena
    Kovacevic, Aleksandar
    SCIENCE OF COMPUTER PROGRAMMING, 2024, 238
  • [37] ON PETRARCA RAUCITAS (AND SONORITAS) : AN ANNOTATION ON THE CODE OF THE ABBOZZI
    Di Silvestro, Antonio
    RIVISTA DI LETTERATURA ITALIANA, 2019, 37 (01) : 21 - 31
  • [38] Leveraging Comment Retrieval for Code Summarization
    Hou, Shifu
    Chen, Lingwei
    Ju, Mingxuan
    Ye, Yanfang
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II, 2023, 13981 : 439 - 447
  • [39] Code Verification Hashing for Image Retrieval
    Chen, Yinqi
    Lu, Zhiyi
    Lu, Ya
    Zhen, Yangting
    Li, Peiwen
    Kang, Shuo
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2531 - 2536
  • [40] Code retrieval via undercover multiplexing
    Barrera, John Fredy
    Henao, Rodrigo
    Tebaldi, Myrian
    Torroba, Roberto
    Bolognini, Nestor
    OPTIK, 2008, 119 (03) : 139 - 142