DIRECT: A Transformer-based Model for Decompiled Variable Name Recovery

Cited by: 0
Authors
Nitin, Vikram [1 ]
Saieva, Anthony [1 ]
Ray, Baishakhi [1 ]
Kaiser, Gail [1 ]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source
NLP4PROG 2021: THE 1ST WORKSHOP ON NATURAL LANGUAGE PROCESSING FOR PROGRAMMING (NLP4PROG 2021) | 2021
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Decompiling binary executables to high-level code is an important step in reverse engineering scenarios, such as malware analysis and legacy code maintenance. However, the generated high-level code is difficult to understand since the original variable names are lost. In this paper, we leverage transformer models to reconstruct the original variable names from decompiled code. Inherent differences between code and natural language present certain challenges in applying conventional transformer-based architectures to variable name recovery. We propose DIRECT, a novel transformer-based architecture customized specifically for the task at hand. We evaluate our model on a dataset of decompiled functions and find that DIRECT outperforms the previous state-of-the-art model by up to 20%. We also present ablation studies evaluating the impact of each of our modifications. We make the source code of DIRECT available to encourage reproducible research.
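For orientation only, the sketch below illustrates one generic way to frame variable name recovery as a per-token prediction task with a standard Transformer encoder. It is not the DIRECT architecture described in the paper; the placeholder convention, the closed name vocabulary, and all hyperparameters are illustrative assumptions.

# Hypothetical sketch: variable name recovery as per-token prediction over
# decompiled-code tokens with a plain Transformer encoder (NOT the DIRECT model).
import torch
import torch.nn as nn

class VariableNameRecoverer(nn.Module):
    def __init__(self, code_vocab_size=8000, name_vocab_size=4000,
                 d_model=256, nhead=8, num_layers=4, max_len=512):
        super().__init__()
        self.embed = nn.Embedding(code_vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positional embeddings
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        # Predict a name from a closed name vocabulary at each token position;
        # only positions holding decompiler placeholders contribute to the loss.
        self.name_head = nn.Linear(d_model, name_vocab_size)

    def forward(self, code_token_ids):
        positions = torch.arange(code_token_ids.size(1), device=code_token_ids.device)
        x = self.embed(code_token_ids) + self.pos(positions)
        h = self.encoder(x)
        return self.name_head(h)  # (batch, seq_len, name_vocab_size)

# Usage sketch: each decompiled function is tokenized so that every
# decompiler-generated placeholder (e.g. "v1", "v2") maps to a dedicated token
# id; training minimizes cross-entropy against the original developer-chosen
# names at those placeholder positions only.
model = VariableNameRecoverer()
fake_batch = torch.randint(0, 8000, (2, 128))   # 2 functions, 128 tokens each
logits = model(fake_batch)
print(logits.shape)  # torch.Size([2, 128, 4000])

Casting the problem as classification over a fixed name vocabulary is only one option; decoding name subtokens with a sequence-to-sequence model is a common alternative.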
Pages: 48-57
Number of pages: 10
Related Papers
50 records in total
  • [31] A Transformer-based Medical Visual Question Answering Model
    Liu, Lei
    Su, Xiangdong
    Guo, Hui
    Zhu, Daobin
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1712 - 1718
  • [32] A Transformer-based Embedding Model for Personalized Product Search
    Bi, Keping
    Ai, Qingyao
    Croft, W. Bruce
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1521 - 1524
  • [33] An ensemble transformer-based model for Arabic sentiment analysis
    Mohamed, Omar
    Kassem, Aly M. M.
    Ashraf, Ali
    Jamal, Salma
    Mohamed, Ensaf Hussein
    SOCIAL NETWORK ANALYSIS AND MINING, 2022, 13 (01)
  • [34] LayoutDM: Transformer-based Diffusion Model for Layout Generation
    Chai, Shang
    Zhuang, Liansheng
    Yan, Fengying
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18349 - 18358
  • [36] Transformer-based power system energy prediction model
    Rao, Zhuyi
    Zhang, Yunxiang
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020, : 913 - 917
  • [38] DeepReducer: A linear transformer-based model for MEG denoising
    Xu, Hui
    Zheng, Li
    Liao, Pan
    Lyu, Bingjiang
    Gao, Jia-Hong
    NEUROIMAGE, 2025, 308
  • [39] ParsBERT: Transformer-based Model for Persian Language Understanding
    Farahani, Mehrdad
    Gharachorloo, Mohammad
    Farahani, Marzieh
    Manthouri, Mohammad
    NEURAL PROCESSING LETTERS, 2021, 53 (06) : 3831 - 3847
  • [40] Generating Music Transition by Using a Transformer-Based Model
    Hsu, Jia-Lien
    Chang, Shuh-Jiun
    ELECTRONICS, 2021, 10 (18)