HEBCS: A High-Efficiency Binary Code Search Method

Cited by: 2
Authors
Sun, Xiangjie [1, 2]
Wei, Qiang [2]
Du, Jiang [2]
Wang, Yisen [2]
Affiliations
[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou 450002, Peoples R China
[2] PLA Informat Engn Univ, Sch Cyber Sci & Engn, Zhengzhou 450001, Peoples R China
Keywords
binary code search; binary code similarity; locality-sensitive hash; software analysis
DOI
10.3390/electronics12163464
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Binary code search finds code in a database that is similar to a given piece of code. It is widely used in scenarios such as vulnerability queries and code defect analysis. Many existing methods employ advanced machine learning models for similarity analysis, but their lack of interpretability and their low efficiency on large-scale function sets remain challenges. To address these issues, we propose a high-efficiency binary code search method called HEBCS. It uses an interpretable approach to extract function-level features and transforms each feature into a locality-sensitive hash representation. The hashes of these features are then combined to form the hash of the function. By leveraging the pigeonhole principle, HEBCS stores and retrieves functions efficiently, maintaining high execution efficiency even on large-scale data. We compare HEBCS with a classic method and a state-of-the-art method and show that HEBCS achieves significantly higher search efficiency while maintaining comparable accuracy, recall, and F1-score. In real-world vulnerability query applications, HEBCS showed promising results, and its effectiveness in large-scale binary function search suggests significant potential for practical use.
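The abstract does not spell out how the pigeonhole principle is applied, so the following is only a minimal sketch of one common way to use it for hash lookup, not the HEBCS implementation. It assumes a 64-bit function hash and a Hamming-distance threshold of 3; the names `SegmentIndex`, `HASH_BITS`, and `MAX_DISTANCE` are hypothetical. If two hashes differ in at most 3 bits and each hash is split into 4 segments, at least one segment must be identical, so exact-match lookups on segments are enough to collect every candidate.

```python
from collections import defaultdict

HASH_BITS = 64                    # assumed hash width (not stated in the abstract)
MAX_DISTANCE = 3                  # assumed Hamming-distance threshold
NUM_SEGMENTS = MAX_DISTANCE + 1   # pigeonhole: k differing bits cannot touch k+1 segments
SEGMENT_BITS = HASH_BITS // NUM_SEGMENTS


def segments(h):
    """Split a hash into NUM_SEGMENTS equal-width bit segments."""
    mask = (1 << SEGMENT_BITS) - 1
    return [(h >> (i * SEGMENT_BITS)) & mask for i in range(NUM_SEGMENTS)]


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


class SegmentIndex:
    """Hypothetical index: one exact-match table per segment position."""

    def __init__(self):
        self.tables = [defaultdict(list) for _ in range(NUM_SEGMENTS)]

    def add(self, name, h):
        # Store the function under each of its segment values.
        for table, seg in zip(self.tables, segments(h)):
            table[seg].append((name, h))

    def query(self, h):
        # Collect candidates that share at least one segment with the query,
        # then verify each candidate with the full Hamming distance.
        candidates = {}
        for table, seg in zip(self.tables, segments(h)):
            for name, stored in table.get(seg, ()):
                candidates[name] = stored
        return sorted(
            name for name, stored in candidates.items()
            if hamming(h, stored) <= MAX_DISTANCE
        )


index = SegmentIndex()
base = 0x0123456789ABCDEF
index.add("func_a", base)                        # identical hash
index.add("func_b", base ^ 0b111)                # 3 bits differ, all in one segment
index.add("func_c", base ^ 0xFFFFFFFFFFFFFFFF)   # every bit differs
result = index.query(base)
```

With this layout a query touches only the buckets matching its four segment values instead of scanning every stored hash, which is the kind of saving the pigeonhole argument makes possible.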
Pages: 21