Automatic Fixation of Decompilation Quirks Using Pre-trained Language Model

Citations: 0
Authors
Kaichi, Ryunosuke [1 ]
Matsumoto, Shinsuke [1 ]
Kusumoto, Shinji [1 ]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Osaka, Japan
Keywords
decompiler; fine-tuning; deep learning; quirk; grammatical error correction;
DOI
10.1007/978-3-031-49266-2_18
CLC classification number
TP31 [Computer Software];
Subject classification codes
081202; 0835;
Abstract
A decompiler is a system that recovers the original source code from bytecode. A critical challenge for decompilers is that the decompiled code differs from the original code; these differences, called quirks, not only reduce the readability of the source code but may also change the program's behavior. In this study, we propose a deep-learning-based quirk fixation method that adopts grammatical error correction. One advantage of the proposed method is that it can be applied to any decompiler and programming language. Our experimental results show that the proposed method removes 55% of identifier quirks and 91% of structural quirks. In some cases, however, the proposed method also injected a small number of new quirks.
Pages: 259-266
Page count: 8
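
The abstract frames quirk fixation as a grammatical error correction task over decompiled code. Below is a minimal sketch of that idea, not the authors' implementation: it fine-tunes a pre-trained sequence-to-sequence code model on pairs of (decompiled code, original code) and then asks the model to rewrite a new decompiled snippet. The model name `Salesforce/codet5-small`, the toy training pair, and all hyperparameters are illustrative assumptions.

```python
# Sketch (assumptions throughout): treat quirk fixation as seq2seq
# "error correction" from decompiled code to original code.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Salesforce/codet5-small"  # assumed; any seq2seq code PLM could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Training pairs: (decompiled code containing quirks, original code).
# A single toy pair is shown; a real corpus would hold many such pairs.
pairs = [
    ("int var1 = 0; while (true) { if (var1 >= 10) break; var1++; }",
     "for (int i = 0; i < 10; i++) { }"),
]

model.train()
for epoch in range(3):  # illustrative epoch count
    for quirky, original in pairs:
        inputs = tokenizer(quirky, return_tensors="pt",
                           truncation=True, max_length=512)
        labels = tokenizer(original, return_tensors="pt",
                           truncation=True, max_length=512).input_ids
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Inference: "correct" a newly decompiled snippet with beam search.
model.eval()
test = tokenizer("int var2 = 0; while (true) { if (var2 >= 5) break; var2++; }",
                 return_tensors="pt")
out = model.generate(**test, max_length=128, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because the model only sees code-to-code text pairs, this style of pipeline is independent of any particular decompiler or programming language, which matches the portability claim in the abstract.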