DIRECT : A Transformer-based Model for Decompiled Variable Name Recovery

Cited by: 0
Authors
Nitin, Vikram [1 ]
Saieva, Anthony [1 ]
Ray, Baishakhi [1 ]
Kaiser, Gail [1 ]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source
NLP4PROG 2021: THE 1ST WORKSHOP ON NATURAL LANGUAGE PROCESSING FOR PROGRAMMING (NLP4PROG 2021) | 2021
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decompiling binary executables to high-level code is an important step in reverse engineering scenarios, such as malware analysis and legacy code maintenance. However, the generated high-level code is difficult to understand since the original variable names are lost. In this paper, we leverage transformer models to reconstruct the original variable names from decompiled code. Inherent differences between code and natural language present certain challenges in applying conventional transformer-based architectures to variable name recovery. We propose DIRECT, a novel transformer-based architecture customized specifically for the task at hand. We evaluate our model on a dataset of decompiled functions and find that DIRECT outperforms the previous state-of-the-art model by up to 20%. We also present ablation studies evaluating the impact of each of our modifications. We make the source code of DIRECT available to encourage reproducible research.
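To make the task concrete, here is a minimal, hypothetical sketch of the variable name recovery problem the abstract describes: a decompiler emits generic placeholder identifiers (e.g. `v1`, `v2`), and a model such as DIRECT predicts a mapping from placeholders back to meaningful names. The snippet, placeholder names, and predicted mapping below are all illustrative assumptions, not taken from the paper.

```python
import re

# A decompiler typically loses original names and emits placeholders.
DECOMPILED = "int v1 = 0; for (int v2 = 0; v2 < n; v2++) v1 += buf[v2];"

def apply_name_map(code: str, name_map: dict) -> str:
    """Substitute each placeholder identifier with its predicted name,
    matching whole tokens only so e.g. 'v1' does not clobber 'v12'."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, name_map)) + r")\b")
    return pattern.sub(lambda m: name_map[m.group(1)], code)

# A made-up prediction a name-recovery model might produce for this snippet.
predicted = {"v1": "total", "v2": "i"}
recovered = apply_name_map(DECOMPILED, predicted)
# recovered == "int total = 0; for (int i = 0; i < n; i++) total += buf[i];"
```

The renaming step itself is trivial; the hard part, and the paper's contribution, is producing the `predicted` mapping from decompiled code alone.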
Pages: 48-57
Number of pages: 10
Related Papers
50 items in total
  • [21] DLGNet: A Transformer-based Model for Dialogue Response Generation
    Olabiyi, Oluwatobi
    Mueller, Erik T.
    NLP FOR CONVERSATIONAL AI, 2020, : 54 - 62
  • [22] AN EFFICIENT TRANSFORMER-BASED MODEL FOR VOICE ACTIVITY DETECTION
    Zhao, Yifei
    Champagne, Benoit
    2022 IEEE 32ND INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2022,
  • [23] TRANSQL: A Transformer-based Model for Classifying SQL Queries
    Tahmasebi, Shirin
    Payberah, Amir H.
    Soylu, Ahmet
    Roman, Dumitru
    Matskin, Mihhail
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 788 - 793
  • [24] Learning Daily Human Mobility with a Transformer-Based Model
    Wang, Weiying
    Osaragi, Toshihiro
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2024, 13 (02)
  • [25] A Transformer-based Audio Captioning Model with Keyword Estimation
    Koizumi, Yuma
    Masumura, Ryo
    Nishida, Kyosuke
    Yasuda, Masahiro
    Saito, Shoichiro
    INTERSPEECH 2020, 2020, : 1977 - 1981
  • [26] Transformer-based heart language model with electrocardiogram annotations
    Tudjarski, Stojancho
    Gusev, Marjan
    Kanoulas, Evangelos
     SCIENTIFIC REPORTS, 2025, 15 (01)
  • [27] LVBERT: Transformer-Based Model for Latvian Language Understanding
    Znotins, Arturs
    Barzdins, Guntis
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 111 - 115
  • [28] An Improved Transformer-Based Model for Urban Pedestrian Detection
    Wu, Tianyong
    Li, Xiang
    Dong, Qiuxuan
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2025, 18 (01)
  • [29] Predicting the formation of NADES using a transformer-based model
    Ayres, Lucas B.
    Gomez, Federico J. V.
    Silva, Maria Fernanda
    Linton, Jeb R.
    Garcia, Carlos D.
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [30] Transformer-based code model with compressed hierarchy representation
    Zhang, Kechi
    Li, Jia
    Li, Zhuo
    Jin, Zhi
    Li, Ge
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (02)