CORES: COde REpresentation Summarization for Code Search

Authors
Zhang, Xu [1]
Hu, Xiaoyu [1]
Zhou, Deyu [1]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Key Lab New Generat Artificial Intelligence Techno, Minist Educ, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Codes; Semantics; Feature extraction; Vectors; Redundancy; Training; Software development management; Code search; code representation; summarization;
DOI
10.1109/TCE.2024.3445139
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract
With the growth of the consumer electronics market, the software development industry faces new opportunities and an increased focus on code retrieval techniques that improve efficiency and reduce costs. Code search aims to retrieve and reuse code from extensive repositories based on a search query with specific requirements. Recently, approaches based on pre-trained models have become popular because they capture the semantic representations of code snippets and search queries accurately. However, such approaches ignore the inconsistency between code and query statements caused by redundant tokens in the code snippets, such as definitions and punctuation marks, which hinders matching accuracy. To address this problem, this paper proposes two strategies based on explicit or implicit code representation summarization. Summarizing the code representation removes redundancy in the code and alleviates the inconsistency between code and query statements. In the explicit strategy, different views of contextual information are obtained and summarized using pyramidal dilated convolutions at different scales. In the implicit strategy, a covariance constraint is applied directly to the code representation to reduce redundancy. Experimental results on six benchmark datasets show that both strategies of the proposed model, CORES, outperform the current state-of-the-art model by 1.2% in average MRR score.
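The abstract gives no architectural details, so the following is only a minimal illustrative sketch of the two ideas it names, not the authors' implementation. It assumes a code snippet is represented as a list of token vectors. The function names (`dilated_conv1d`, `pyramid_summary`, `covariance_penalty`), the fixed averaging kernel, the dilation rates, and the exact form of the penalty are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # rows = tokens (or samples), columns = hidden dimensions


def dilated_conv1d(tokens: Matrix, kernel: Sequence[float], dilation: int) -> Matrix:
    """Per-dimension 1-D convolution over the token axis, zero-padded to keep length."""
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = [[0.0] * d for _ in range(n)]
    for t in range(n):
        for k, w in enumerate(kernel):
            src = t + (k - half) * dilation  # neighbours spaced `dilation` tokens apart
            if 0 <= src < n:
                for j in range(d):
                    out[t][j] += w * tokens[src][j]
    return out


def pyramid_summary(tokens: Matrix, dilations: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Explicit summarization (assumed form): average the views produced by
    several dilation rates, then mean-pool over tokens into one vector."""
    kernel = [1 / 3, 1 / 3, 1 / 3]  # assumption: fixed kernel; a real model would learn it
    n, d = len(tokens), len(tokens[0])
    summary = [0.0] * d
    for rate in dilations:
        view = dilated_conv1d(tokens, kernel, rate)
        for row in view:
            for j in range(d):
                summary[j] += row[j] / (n * len(dilations))
    return summary


def covariance_penalty(batch: Matrix) -> float:
    """Implicit summarization (assumed form): sum of squared off-diagonal
    covariance entries, which is zero when hidden dimensions are decorrelated."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    penalty = 0.0
    for a in range(d):
        for b in range(d):
            if a != b:
                cov = sum((r[a] - means[a]) * (r[b] - means[b]) for r in batch) / (n - 1)
                penalty += cov ** 2
    return penalty / d
```

In this sketch, larger dilation rates let each position see more distant tokens without a larger kernel, which is one way to obtain "different views of contextual information". The covariance term would be added to a training loss so that redundant (correlated) dimensions are discouraged.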
Pages: 6095-6104
Page count: 10