Associating Natural Language Comment and Source Code Entities

Cited by: 0
Authors
Panthaplackel, Sheena [1]
Gligoric, Milos [2]
Mooney, Raymond J. [1]
Li, Junyi Jessy [3]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[3] Univ Texas Austin, Dept Linguist, Austin, TX 78712 USA
Source
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020, Vol. 34
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.
Pages: 8592-8599
Page count: 8