Automatic Fixation of Decompilation Quirks Using Pre-trained Language Model

Cited by: 0
Authors
Kaichi, Ryunosuke [1 ]
Matsumoto, Shinsuke [1 ]
Kusumoto, Shinji [1 ]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Osaka, Japan
Keywords
decompiler; fine-tuning; deep learning; quirk; grammatical error correction;
DOI
10.1007/978-3-031-49266-2_18
Chinese Library Classification (CLC)
TP31 [计算机软件];
Subject Classification Code
081202 ; 0835 ;
Abstract
A decompiler is a system for recovering the original source code from bytecode. A critical challenge for decompilers is that the decompiled code differs from the original code. These differences, called quirks, not only reduce the readability of the source code but may also change the program's behavior. In this study, we propose a deep learning-based quirk fixation method that adopts grammatical error correction. One advantage of the proposed method is that it can be applied to any decompiler and programming language. Our experimental results show that the proposed method removes 55% of identifier quirks and 91% of structural quirks. In some cases, however, it also injects a small number of new quirks.
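The abstract frames quirk fixation as a grammatical error correction (GEC) task: a pre-trained sequence-to-sequence model is fine-tuned on pairs of decompiled and original code, learning to rewrite quirky output into original-style code. The sketch below illustrates that framing only under stated assumptions; the backbone model (Salesforce/codet5-small), the pairs.jsonl data format, the hyperparameters, and the sample snippet are illustrative choices, not the authors' actual setup.

```python
# Minimal sketch (not the authors' implementation): fine-tune a pre-trained
# seq2seq model so that "decompiled code -> original code" is treated like
# grammatical error correction.
# Assumptions: Salesforce/codet5-small backbone; a paired corpus stored as
# JSON Lines with fields "decompiled" and "original" (hypothetical format).
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)
from datasets import load_dataset

MODEL = "Salesforce/codet5-small"  # assumed backbone, not confirmed by the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

# Each line of pairs.jsonl: {"decompiled": "...", "original": "..."}
dataset = load_dataset("json", data_files="pairs.jsonl", split="train")

def preprocess(batch):
    # Source = quirky decompiled code, target = original code (GEC-style pairing).
    inputs = tokenizer(batch["decompiled"], max_length=512,
                       truncation=True, padding="max_length")
    labels = tokenizer(batch["original"], max_length=512,
                       truncation=True, padding="max_length")
    # Mask padding tokens in the labels so they do not contribute to the loss.
    inputs["labels"] = [
        [(t if t != tokenizer.pad_token_id else -100) for t in seq]
        for seq in labels["input_ids"]
    ]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(output_dir="quirk-fixer",
                                per_device_train_batch_size=8,
                                num_train_epochs=3)
trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=tokenized)
trainer.train()

# Inference: feed decompiled code and generate a "corrected" version.
quirky = "public int f(int a, int b) { int v0 = a + b; return v0; }"
ids = tokenizer(quirky, return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_length=512)[0],
                       skip_special_tokens=True))
```

Because the model only sees source/target text pairs, the same pipeline can be retargeted to any decompiler or programming language by swapping the training corpus, which is the generality advantage the abstract claims.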
Pages: 259-266
Page count: 8
Related Papers
50 records in total
  • [21] Automatic TNM staging of colorectal cancer radiology reports using pre-trained language models
    Chizhikova, Mariia
    Lopez-ubeda, Pilar
    Martin-Noguerol, Teodoro
    Diaz-Galiano, Manuel C.
    Urena-Lopez, L. Alfonso
    Luna, Antonio
    Martin-Valdivia, M. Teresa
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2025, 259
  • [22] A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model
    Hu, Panwen
    Xiao, Nan
    Li, Feifei
    Chen, Yongquan
    Huang, Rui
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6441 - 6450
  • [23] SsciBERT: a pre-trained language model for social science texts
    Si Shen
    Jiangfeng Liu
    Litao Lin
    Ying Huang
    Lin Zhang
    Chang Liu
    Yutong Feng
    Dongbo Wang
    Scientometrics, 2023, 128 : 1241 - 1263
  • [24] A Pre-trained Clinical Language Model for Acute Kidney Injury
    Mao, Chengsheng
    Yao, Liang
    Luo, Yuan
    2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020, : 531 - 532
  • [25] Few-Shot NLG with Pre-Trained Language Model
    Chen, Zhiyu
    Eavani, Harini
    Chen, Wenhu
    Liu, Yinyin
    Wang, William Yang
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 183 - 190
  • [26] Knowledge Enhanced Pre-trained Language Model for Product Summarization
    Yin, Wenbo
    Ren, Junxiang
    Wu, Yuejiao
    Song, Ruilin
    Liu, Lang
    Cheng, Zhen
    Wang, Sibo
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT II, 2022, 13552 : 263 - 273
  • [27] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    ENGINEERING, 2023, 25 : 51 - 65
  • [28] ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence
    Hu, Yibo
    Hosseini, MohammadSaleh
    Parolin, Erick Skorupa
    Osorio, Javier
    Khan, Latifur
    Brandt, Patrick T.
    D'Orazio, Vito J.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5469 - 5482
  • [29] IndicBART: A Pre-trained Model for Indic Natural Language Generation
    Dabre, Raj
    Shrotriya, Himani
    Kunchukuttan, Anoop
    Puduppully, Ratish
    Khapra, Mitesh M.
    Kumar, Pratyush
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1849 - 1863
  • [30] Pre-trained Language Model based Ranking in Baidu Search
    Zou, Lixin
    Zhang, Shengqiang
    Cai, Hengyi
    Ma, Dehong
    Cheng, Suqi
    Wang, Shuaiqiang
    Shi, Daiting
    Cheng, Zhicong
    Yin, Dawei
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 4014 - 4022