Multi-grained contextual code representation learning for commit message generation

Cited by: 3
Authors
Wang, Chuangwei [1]
Zhang, Li [1]
Zhang, Xiaofang [1]
Affiliation
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Code change; Code representation learning; Commit message generation; Pre-training; COMPLETION;
DOI
10.1016/j.infsof.2023.107393
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Commit messages, which describe the code changes in each commit in natural language, make it possible for developers and subsequent reviewers to understand those changes without digging into implementation details. However, the semantic and structural gap between code and natural language poses a significant challenge for commit message generation. Several researchers have proposed automated techniques to generate commit messages, but these techniques do not sufficiently exploit the information contained in the code. In this paper, we propose multi-grained contextual code representation learning for commit message generation (COMU). We extract multi-grained information from the changed code at the line level and the AST level (i.e., Code_Diff and AST_Diff). In Code_Diff, we construct global contextual semantic information about the changed code and use three different tokens to mark whether a line of code has changed. In AST_Diff, we extract the code structure from source code changes and combine the extracted structure with four types of editing operations to focus explicitly on the details of the changed part. In addition, we build the experimental datasets ourselves, since no sufficiently large public dataset exists for this task. The release of these datasets should help advance research in this field. We perform extensive experiments to evaluate the effectiveness of COMU. The experimental evaluation and a human study show that our model outperforms the baseline model.
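The line-level marking described for Code_Diff can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three tokens, so `<keep>`, `<add>`, and `<del>` are hypothetical, as is the helper name `mark_lines`, and Python's standard `difflib` is swapped in for whatever differencing the paper actually uses.

```python
import difflib

# Hypothetical marker tokens; the abstract only says three tokens are used.
KEEP, ADD, DEL = "<keep>", "<add>", "<del>"


def mark_lines(old_src: str, new_src: str) -> list[tuple[str, str]]:
    """Label every line of a change as unchanged, added, or deleted.

    Unchanged lines are kept so the changed lines appear in their
    surrounding context (an assumption about what "global contextual"
    information means here).
    """
    old_lines = old_src.splitlines()
    new_lines = new_src.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    marked: list[tuple[str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            marked.extend((KEEP, line) for line in old_lines[i1:i2])
        else:
            # "delete", "insert", and "replace" all reduce to removed old
            # lines followed by added new lines (either side may be empty).
            marked.extend((DEL, line) for line in old_lines[i1:i2])
            marked.extend((ADD, line) for line in new_lines[j1:j2])
    return marked


if __name__ == "__main__":
    before = "a = 1\nb = 2\nreturn a"
    after = "a = 1\nb = 3\nreturn a"
    for token, line in mark_lines(before, after):
        print(token, line)
```

A sequence model could then consume the token-prefixed lines as one input sequence. The AST_Diff side, with its four editing operations, would need a tree-differencing step that this sketch does not attempt.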
Pages: 14
Related papers
50 in total
  • [31] Visual learning graph convolution for multi-grained orange quality grading
    Guan Zhi-bin
    Zhang Yan-qi
    Chai Xiu-juan
    Chai Xin
    Zhang Ning
    Zhang Jian-hua
    Sun Tan
    JOURNAL OF INTEGRATIVE AGRICULTURE, 2023, 22 (01) : 279 - 291
  • [32] A multi-grained aspect vector learning model for unsupervised aspect identification
    Shi, Jinglei
    Guo, Junjun
    Yu, Zhengtao
    Xiang, Yan
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (06) : 12075 - 12085
  • [34] Video-Text Retrieval by Supervised Sparse Multi-Grained Learning
    Wang, Yimu
    Shi, Peng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 633 - 649
  • [35] Learning multi-grained aspect target sequence for Chinese sentiment analysis
    Peng, Haiyun
    Ma, Yukun
    Li, Yang
    Cambria, Erik
    KNOWLEDGE-BASED SYSTEMS, 2018, 148 : 167 - 176
  • [36] KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation
    Tao, Wei
    Zhou, Yucheng
    Wang, Yanlin
    Zhang, Hongyu
    Wang, Haofen
    Zhang, Wenqiang
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (05)
  • [37] Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
    Zhang, Hao
    Si, Nianwen
    Chen, Yaqi
    Zhang, Wenlin
    Yang, Xukui
    Qu, Dan
    Zhang, Wei-Qiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1075 - 1086
  • [38] Learning Multi-Stage Multi-Grained Semantic Embeddings for E-Commerce Search
    Wang, Binbin
    Li, Mingming
    Zeng, Zhixiong
    Zhuo, Jingwei
    Wang, Songlin
    Xu, Sulong
    Long, Bo
    Yan, Weipeng
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 411 - 415
  • [39] MGICL: Multi-Grained Interaction Contrastive Learning for Multimodal Named Entity Recognition
    Guo, Aibo
    Zhao, Xiang
    Tan, Zhen
    Xiao, Weidong
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 639 - 648
  • [40] MGD-GAN: Text-to-Pedestrian Generation Through Multi-grained Discrimination
    Zhang, Shengyu
    Wang, Donghui
    Zhao, Zhou
    Tang, Siliang
    Kuang, Kun
    Xie, Di
    Wu, Fei
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 662 - 673