DIRECT : A Transformer-based Model for Decompiled Variable Name Recovery

Cited by: 0
Authors
Nitin, Vikram [1 ]
Saieva, Anthony [1 ]
Ray, Baishakhi [1 ]
Kaiser, Gail [1 ]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source
NLP4PROG 2021: THE 1ST WORKSHOP ON NATURAL LANGUAGE PROCESSING FOR PROGRAMMING (NLP4PROG 2021) | 2021
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decompiling binary executables to high-level code is an important step in reverse engineering scenarios, such as malware analysis and legacy code maintenance. However, the generated high-level code is difficult to understand since the original variable names are lost. In this paper, we leverage transformer models to reconstruct the original variable names from decompiled code. Inherent differences between code and natural language present certain challenges in applying conventional transformer-based architectures to variable name recovery. We propose DIRECT, a novel transformer-based architecture customized specifically for the task at hand. We evaluate our model on a dataset of decompiled functions and find that DIRECT outperforms the previous state-of-the-art model by up to 20%. We also present ablation studies evaluating the impact of each of our modifications. We make the source code of DIRECT available to encourage reproducible research.
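To make the task concrete, here is a minimal, hypothetical sketch of the variable name recovery problem the abstract describes: a decompiler emits generic placeholder identifiers (e.g. `v1`, `v2`), and a model such as DIRECT predicts a mapping from placeholders back to meaningful names. The snippet, placeholder names, and predicted mapping below are all illustrative assumptions, not taken from the paper.

```python
import re

# A decompiler typically loses original names and emits placeholders.
DECOMPILED = "int v1 = 0; for (int v2 = 0; v2 < n; v2++) v1 += buf[v2];"

def apply_name_map(code: str, name_map: dict) -> str:
    """Substitute each placeholder identifier with its predicted name,
    matching whole tokens only so e.g. 'v1' does not clobber 'v12'."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, name_map)) + r")\b")
    return pattern.sub(lambda m: name_map[m.group(1)], code)

# A made-up prediction a name-recovery model might produce for this snippet.
predicted = {"v1": "total", "v2": "i"}
recovered = apply_name_map(DECOMPILED, predicted)
# recovered == "int total = 0; for (int i = 0; i < n; i++) total += buf[i];"
```

The renaming step itself is trivial; the hard part, and the paper's contribution, is producing the `predicted` mapping from decompiled code alone.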
Pages: 48-57
Number of pages: 10
Related Papers
50 items in total
  • [21] DLGNet: A Transformer-based Model for Dialogue Response Generation
    Olabiyi, Oluwatobi
    Mueller, Erik T.
    NLP FOR CONVERSATIONAL AI, 2020, : 54 - 62
  • [22] AN EFFICIENT TRANSFORMER-BASED MODEL FOR VOICE ACTIVITY DETECTION
    Zhao, Yifei
    Champagne, Benoit
    2022 IEEE 32ND INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2022,
  • [23] TRANSQL: A Transformer-based Model for Classifying SQL Queries
    Tahmasebi, Shirin
    Payberah, Amir H.
    Soylu, Ahmet
    Roman, Dumitru
    Matskin, Mihhail
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 788 - 793
  • [24] Learning Daily Human Mobility with a Transformer-Based Model
    Wang, Weiying
    Osaragi, Toshihiro
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2024, 13 (02)
  • [25] A Transformer-based Audio Captioning Model with Keyword Estimation
    Koizumi, Yuma
    Masumura, Ryo
    Nishida, Kyosuke
    Yasuda, Masahiro
    Saito, Shoichiro
    INTERSPEECH 2020, 2020, : 1977 - 1981
  • [26] Transformer-based heart language model with electrocardiogram annotations
    Tudjarski, Stojancho
    Gusev, Marjan
    Kanoulas, Evangelos
     SCIENTIFIC REPORTS, 2025, 15 (01)
  • [27] LVBERT: Transformer-Based Model for Latvian Language Understanding
    Znotins, Arturs
    Barzdins, Guntis
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 111 - 115
  • [28] An Improved Transformer-Based Model for Urban Pedestrian Detection
    Wu, Tianyong
    Li, Xiang
    Dong, Qiuxuan
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2025, 18 (01)
  • [29] Predicting the formation of NADES using a transformer-based model
    Ayres, Lucas B.
    Gomez, Federico J. V.
    Silva, Maria Fernanda
    Linton, Jeb R.
    Garcia, Carlos D.
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [30] Transformer-based code model with compressed hierarchy representation
    Zhang, Kechi
    Li, Jia
    Li, Zhuo
    Jin, Zhi
    Li, Ge
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (02)