Source Code Classification Using Neural Networks

Author
Gilda, Shlok [1]
Affiliation
[1] Pune Inst Comp Technol, Dept Comp Engn, Pune, Maharashtra, India
Keywords
Artificial neural network; Multi-layer neural network; Supervised learning; Feature extraction
DOI
Not available
Chinese Library Classification
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Programming languages are the primary tools of the software development industry. Today, the programming language of the vast majority of published source code is either specified manually or assigned programmatically based solely on the file extension. This work shows that the programming language can be identified automatically by an artificial neural network that combines supervised learning with intelligent feature extraction from the source code files. We employ a multi-layer neural network (word embedding layers followed by a Convolutional Neural Network) to achieve this goal. Our criteria for an automatic source code identification solution are high accuracy, fast performance, and broad programming language coverage. The model achieves 97% accuracy while classifying 60 programming languages.
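The abstract names the architecture only at a high level (word embeddings feeding a convolutional network). As a rough illustration, the forward pass of such a classifier might look like the minimal sketch below. Everything specific here is an assumption made for illustration and not taken from the paper: the regex tokenizer, the toy vocabulary, the three example languages, the layer sizes, and the hypothetical names `tokenize`, `rand_matrix`, and `TinyTextCNN`. The weights are random and untrained, so the output probabilities carry no meaning; the sketch only shows how tokens flow through embedding lookup, 1D convolution, max pooling, and a softmax layer.

```python
import math
import random
import re


def tokenize(source):
    """Split source text into identifiers, numbers, and punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def rand_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyTextCNN:
    """Hypothetical embedding + 1D-CNN classifier (untrained, illustration only)."""

    def __init__(self, vocab, languages, embed_dim=8, num_filters=4, width=3, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab          # token -> row index; row 0 = unknown/padding
        self.languages = languages
        self.width = width          # number of consecutive tokens per filter
        self.embeddings = rand_matrix(rng, len(vocab) + 1, embed_dim)
        self.filters = rand_matrix(rng, num_filters, width * embed_dim)
        self.dense = rand_matrix(rng, len(languages), num_filters)

    def predict_proba(self, source):
        # Embedding layer: map each token to its vector.
        ids = [self.vocab.get(tok, 0) for tok in tokenize(source)]
        ids += [0] * max(0, self.width - len(ids))  # pad very short inputs
        vectors = [self.embeddings[i] for i in ids]

        # 1D convolution + ReLU + global max pooling over token positions.
        pooled = []
        for filt in self.filters:
            best = 0.0  # starting the max at 0.0 doubles as the ReLU
            for start in range(len(vectors) - self.width + 1):
                window = [x for vec in vectors[start:start + self.width] for x in vec]
                best = max(best, sum(w * x for w, x in zip(filt, window)))
            pooled.append(best)

        # Dense layer + numerically stable softmax over the language labels.
        logits = [sum(w * p for w, p in zip(row, pooled)) for row in self.dense]
        peak = max(logits)
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return {lang: e / total for lang, e in zip(self.languages, exps)}


# Toy vocabulary and label set, chosen only for this example.
tokens = ["def", "return", "int", "{", "}", ";", ":", "(", ")"]
vocab = {tok: i + 1 for i, tok in enumerate(tokens)}
model = TinyTextCNN(vocab, ["Python", "C", "Java"])
probs = model.predict_proba("def add(a, b): return a + b")
```

In practice the embeddings, filters, and dense weights would be learned from labeled source files with a deep learning framework; the pure-Python loops here stand in for that only to keep the sketch dependency-free.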
Pages: 6