Code Summarization with Abstract Syntax Tree

Cited by: 8
Authors
Chen, Qiuyuan [1]
Hu, Han [2]
Liu, Zhaoyi [3]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Tsinghua Univ, Sch Software, Beijing, Peoples R China
[3] Peking Univ, Sch Shenzhen Grad, Shenzhen 518055, Peoples R China
Keywords
Code summarization; Code clone; Code representation
DOI
10.1007/978-3-030-36802-9_69
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Code summarization, which provides a high-level description of the function implemented by code, plays a vital role in software maintenance and code retrieval. Traditional approaches focus on retrieving similar code snippets to generate summaries, and researchers have recently paid increasing attention to deep learning approaches, especially the encoder-decoder framework. Approaches based on the encoder-decoder framework suffer from two drawbacks: (a) they lack summarization at the functionality level; (b) code snippets are typically long (more than ten words), so regular encoders perform poorly. In this paper, we propose a novel code representation based on Abstract Syntax Trees, which describes the functionality of code snippets and shortens the length of inputs. Based on the proposed code representation, we develop Generative Task, which aims to generate summary sentences for code snippets. Experiments on large-scale real-world industrial Java projects indicate that our approaches are effective and outperform state-of-the-art approaches to code summarization.
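As a rough illustration of why a syntax-tree view can be shorter than the raw token stream, the sketch below keeps only statement-level node types from an Abstract Syntax Tree. It is not the authors' representation, which this record does not specify and which targets Java; it uses Python's standard `ast` and `tokenize` modules on a made-up function, and the names `SOURCE`, `statement_skeleton`, and `token_count` are hypothetical.

```python
import ast
import io
import tokenize

# Hypothetical example snippet, used only for illustration.
SOURCE = '''
def total_price(items, tax_rate):
    total = 0
    for item in items:
        if item.price > 0:
            total += item.price * item.quantity
    return total * (1 + tax_rate)
'''


def statement_skeleton(source):
    """Return the statement-level AST node type names in pre-order."""
    skeleton = []

    def visit(node):
        # Keep statements only; expressions, names, and operators are dropped.
        if isinstance(node, ast.stmt):
            skeleton.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return skeleton


def token_count(source):
    """Count lexical tokens, ignoring layout-only tokens and comments."""
    layout = {
        tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type not in layout)


if __name__ == "__main__":
    print(statement_skeleton(SOURCE))
    print(len(statement_skeleton(SOURCE)), "skeleton items vs",
          token_count(SOURCE), "tokens")
```

The skeleton discards identifiers and expressions, so it is far shorter than the token sequence but also loses information; a practical representation would have to decide which parts of the tree to retain.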
Pages: 652-660
Page count: 9
Related papers
50 in total
  • [1] Improving Code Summarization with Block-wise Abstract Syntax Tree Splitting
    Lin, Chen
    Ouyang, Zhichao
    Zhuang, Junqing
    Chen, Jianqiang
    Li, Hui
    Wu, Rongxin
    2021 IEEE/ACM 29TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC 2021), 2021, : 184 - 195
  • [2] Static code detection based on abstract syntax tree
    Lu, Xiaofeng
    Fang, Denghui
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 125 : 195 - 195
  • [3] Source Code Pattern as Anchored Abstract Syntax Tree
    Nakayama, Ken
    Sakai, Eko
    2014 5TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2014, : 170 - 173
  • [4] CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees
    Shi, Ensheng
    Wang, Yanlin
    Du, Lun
    Zhang, Hongyu
    Han, Shi
    Zhang, Dongmei
    Sun, Hongbin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 4053 - 4062
  • [5] AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization
    Tang, Ze
    Li, Chuanyi
    Ge, Jidong
    Shen, Xiaoyu
    Zhu, Zheling
    Luo, Bin
    2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 1193 - 1195
  • [6] PassSum: Leveraging paths of abstract syntax trees and self-supervision for code summarization
    Niu, Changan
    Li, Chuanyi
    Ng, Vincent
    Ge, Jidong
    Huang, Liguo
    Luo, Bin
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2023, 36 (06)
  • [7] Source Code Plagiarism Detection Based on Abstract Syntax Tree Fingerprintings
    Suttichaya, Vasin
    Eakvorachai, Niracha
    Lurkraisit, Tunchanok
    2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022,
  • [8] Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
    Song, Yewei
    Lothritz, Cedric
    Tang, Daniel
    Bissyandé, Tegawendé F.
    Klein, Jacques
    arXiv,
  • [9] The Metric for Automatic Code Generation Based on Dynamic Abstract Syntax Tree
    Yao, Wenjun
    Jiang, Ying
    Yang, Yang
    INTERNATIONAL JOURNAL OF DIGITAL CRIME AND FORENSICS, 2023, 15 (01)
  • [10] Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
    Song, Yewei
    Lothritz, Cedric
    Tang, Daniel
    Bissyande, Tegawende F.
    Klein, Jacques
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 38 - 46