Text2PyCode: Machine Translation of Natural Language Intent to Python Source Code

Cited by: 1
Authors
Bonthu, Sridevi [1, 3]
Sree, S. Rama [2]
Prasad, M. H. M. Krishna [3]
Affiliations
[1] Vishnu Inst Technol, Bhimavaram, Andhra Pradesh, India
[2] Aditya Engn Coll, Surampalem, Andhra Pradesh, India
[3] Jawaharlal Nehru Technol Univ, Kakinada, Andhra Pradesh, India
Source
MACHINE LEARNING AND KNOWLEDGE EXTRACTION (CD-MAKE 2021) | 2021 / Vol. 12844
Keywords
Deep learning; Attention; Transformer; Code generation; Python; Neural machine translation
DOI
10.1007/978-3-030-84060-0_4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural Language Processing has improved tremendously with the success of Deep Learning, and Neural Machine Translation (NMT) has emerged as one of the most powerful techniques built on it. The same idea has recently been applied to source code: Code Generation (CG) is the task of generating source code from natural language input. This paper introduces a Python parallel corpus of natural language intent and source code pairs. It also proposes a Code Generation model based on the Transformer architecture used for NMT, applying code tokenization and code embeddings to the custom parallel corpus. The proposed architecture achieved a BLEU score of 32.4 and a ROUGE-L score of 82.1, which the authors report as on par with natural language translation.
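The abstract mentions code tokenization but does not say how it is done. As a rough illustration only, the sketch below splits a Python snippet into lexical tokens with the standard-library `tokenize` module, which is one plausible way to prepare the target side of an intent/code pair. The function name `tokenize_code` and the `<NEWLINE>`, `<INDENT>`, and `<DEDENT>` markers are assumptions made for this example, not the paper's method.

```python
import io
import tokenize


def tokenize_code(source: str) -> list:
    """Split Python source into a flat list of lexical tokens.

    Hypothetical helper: layout tokens are mapped to explicit markers so
    that indentation survives as part of the token sequence.
    """
    # Token types that carry no information useful for this sketch.
    skip = {tokenize.ENCODING, tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}
    # Marker names are an assumption of this example.
    layout = {
        tokenize.NEWLINE: "<NEWLINE>",
        tokenize.INDENT: "<INDENT>",
        tokenize.DEDENT: "<DEDENT>",
    }
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        tokens.append(layout.get(tok.type, tok.string))
    return tokens


# One illustrative intent/code pair (made up for this example).
intent = "add two numbers"
code = "def add(a, b):\n    return a + b\n"

intent_tokens = intent.split()
code_tokens = tokenize_code(code)
print(intent_tokens)
print(code_tokens)
```

Using the language's own tokenizer keeps operators, names, and indentation as separate symbols, which a whitespace split would not do; whether the paper uses this or a different scheme is not stated in the abstract.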
Pages: 51-60
Number of pages: 10