DIRECT: A Transformer-based Model for Decompiled Variable Name Recovery

Cited by: 0
Authors
Nitin, Vikram [1]
Saieva, Anthony [1]
Ray, Baishakhi [1]
Kaiser, Gail [1]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Keywords
DOI
None
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decompiling binary executables to high-level code is an important step in reverse engineering scenarios, such as malware analysis and legacy code maintenance. However, the generated high-level code is difficult to understand since the original variable names are lost. In this paper, we leverage transformer models to reconstruct the original variable names from decompiled code. Inherent differences between code and natural language present certain challenges in applying conventional transformer-based architectures to variable name recovery. We propose DIRECT, a novel transformer-based architecture customized specifically for the task at hand. We evaluate our model on a dataset of decompiled functions and find that DIRECT outperforms the previous state-of-the-art model by up to 20%. We also present ablation studies evaluating the impact of each of our modifications. We make the source code of DIRECT available to encourage reproducible research.
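The abstract frames variable name recovery as predicting original names for the placeholder identifiers (e.g. `v1`, `a2`) that decompilers emit. A minimal sketch of that input framing, assuming a hypothetical placeholder pattern and mask-token scheme (the paper's actual tokenization is not specified here):

```python
import re

def extract_rename_targets(decompiled: str) -> list[str]:
    """Collect decompiler-generated placeholder names (assumed here to
    match v<N> or a<N>) that a name-recovery model would replace."""
    return sorted(set(re.findall(r'\b[va]\d+\b', decompiled)))

def mask_placeholders(decompiled: str, targets: list[str]) -> str:
    """Replace each placeholder with an indexed mask token -- a common
    seq2seq input framing for name-recovery models."""
    for i, name in enumerate(targets):
        decompiled = re.sub(rf'\b{name}\b', f'<VAR{i}>', decompiled)
    return decompiled

# Toy decompiled snippet: the model would be asked to predict a
# meaningful name for each <VARi> slot.
src = "int v1 = a1 + 1; return v1 * a1;"
targets = extract_rename_targets(src)     # ['a1', 'v1']
masked = mask_placeholders(src, targets)
# "int <VAR1> = <VAR0> + 1; return <VAR1> * <VAR0>;"
```

This only illustrates the task setup; the transformer itself would consume the masked sequence and emit one name per slot.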
Pages: 48 - 57
Page count: 10
Related papers
(50 records in total)
  • [1] Transformer-based approach to variable typing
    Rey, Charles Arthel
    Danguilan, Jose Lorenzo
    Mendoza, Karl Patrick
    Remolona, Miguel Francisco
    HELIYON, 2023, 9 (10)
  • [2] Transformer-Based Direct Hidden Markov Model for Machine Translation
    Wang, Weiyue
    Yang, Zijian
    Gao, Yingbo
    Ney, Hermann
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 23 - 32
  • [3] Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation
    Sant, Gerard
    Gállego, Gerard I.
    Alastruey, Belen
    Costa-Jussà, Marta R.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 277 - 284
  • [5] A Transformer-based Function Symbol Name Inference Model from an Assembly Language for Binary Reversing
    Kim, HyunJin
    Bak, JinYeong
    Cho, Kyunghyun
    Koo, Hyungjoon
    PROCEEDINGS OF THE 2023 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, ASIA CCS 2023, 2023, : 951 - 965
  • [6] Direct conversion of peptides into diverse peptidomimetics using a transformer-based chemical language model
    Yoshimori, Atsushi
    Bajorath, Juergen
    EUROPEAN JOURNAL OF MEDICINAL CHEMISTRY REPORTS, 2025, 13
  • [7] Transformer-based Image Compression with Variable Image Quality Objectives
    Kao, Chia-Hao
    Chen, Yi-Hsin
    Chien, Cheng
    Chiu, Wei-Chen
    Peng, Wen-Hsiao
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1718 - 1725
  • [8] Smart Transformer-based Frequency Support in Variable Inertia Conditions
    Langwasser, Marius
    De Carne, Giovanni
    Liserre, Marco
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON COMPATIBILITY, POWER ELECTRONICS AND POWER ENGINEERING (CPE-POWERENG), 2019,
  • [10] Vision Transformer-Based Photovoltaic Prediction Model
    Kang, Zaohui
    Xue, Jizhong
    Lai, Chun Sing
    Wang, Yu
    Yuan, Haoliang
    Xu, Fangyuan
    ENERGIES, 2023, 16 (12)