ConFunc: Enhanced Binary Function-Level Representation through Contrastive Learning

Cited by: 0
Authors
Li, Longfei [1 ]
Yin, Xiaokang [2 ]
Li, Xiao [2 ]
Zhu, Xiaoya [2 ]
Liu, Shengli [2 ]
Affiliations
[1] Zhengzhou Univ, Zhengzhou, Peoples R China
[2] Informat Engn Univ, Zhengzhou, Peoples R China
Keywords
binary code similarity detection; machine learning; contrastive learning; function embeddings;
DOI
10.1109/TrustCom60117.2023.00169
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Binary code similarity detection (BCSD) has numerous applications, including malware detection, vulnerability search, plagiarism detection, and patch identification. Recent studies have demonstrated that, with the rapid progress of machine learning (ML) techniques, various ML-based BCSD approaches outperform traditional methods. However, current ML-based BCSD approaches tend to ignore the issue of training samples, and most are based on supervised learning, which suffers from labelling difficulties. To mitigate these issues, we propose ConFunc: a function-level binary code similarity detection framework based on contrastive learning. Performance evaluation shows that ConFunc improves the Mean Reciprocal Rank (MRR) and Recall rate (Recall@1) of baseline models by fully harnessing the potential of the data. Additionally, ConFunc demonstrates stronger performance in data-scarce scenarios, matching the baseline model's full-dataset performance using only 10% of the complete dataset. In real-world patch identification and vulnerability search tasks, ConFunc consistently outperforms the other baseline models in MRR and Recall@10.
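The retrieval metrics the abstract reports, MRR and Recall@k, can be sketched in plain Python. This is a minimal illustration of how such metrics are typically computed for function-retrieval evaluation; the function identifiers in the example are hypothetical and not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1/rank of the first correct match."""
    total = 0.0
    for results, rel in zip(ranked_lists, relevant):
        for rank, item in enumerate(results, start=1):
            if item == rel:  # first hit determines the reciprocal rank
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

def recall_at_k(ranked_lists, relevant, k):
    """Recall@k: fraction of queries whose match appears in the top-k."""
    hits = sum(1 for results, rel in zip(ranked_lists, relevant)
               if rel in results[:k])
    return hits / len(ranked_lists)

# Two hypothetical queries: the ground-truth match ranks 1st and 2nd.
ranked = [["f1", "f2", "f3"], ["f2", "f1", "f3"]]
truth = ["f1", "f1"]
print(mean_reciprocal_rank(ranked, truth))  # (1/1 + 1/2) / 2 = 0.75
print(recall_at_k(ranked, truth, 1))        # 1 of 2 queries hit at rank 1 = 0.5
```

In a BCSD evaluation, each "query" would be a function embedding and the ranked list its nearest neighbours in the candidate pool, ranked by embedding similarity.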
Pages: 1241 - 1248 (8 pages)
Related papers (50 total)
  • [21] Contrastive learning enhanced by graph neural networks for Universal Multivariate Time Series Representation
    College of Artificial Intelligence, Southwest University, Chongqing 400715, China
    Inf. Syst.,
  • [22] Universal representation learning for multivariate time series using the instance-level and cluster-level supervised contrastive learning
    Moradinasab, Nazanin
    Sharma, Suchetha
    Bar-Yoseph, Ronen
    Radom-Aizik, Shlomit
    Bilchick, Kenneth C.
    Cooper, Dan M.
    Weltman, Arthur
    Brown, Donald E.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 38 (03) : 1493 - 1519
  • [24] Patient-Level Contrastive Learning for Enhanced Biomarker Prediction in Retinal Imaging
    Kim, Hyeonmin
    Seo, Chanyang
    Cho, Yunnie
    Yoo, Tae Keun
    DATA ENGINEERING IN MEDICAL IMAGING, DEMI 2024, 2025, 15265 : 125 - 133
  • [25] Enhancing chemical reaction search through contrastive representation learning and human-in-the-loop
    Youngchun Kwon
    Hyunjeong Jeon
    Joonhyuk Choi
    Youn-Suk Choi
    Seokho Kang
    Journal of Cheminformatics, 17 (1)
  • [26] SynapseCLR: Uncovering features of synapses in primary visual cortex through contrastive representation learning
    Wilson, Alyssa
    Babadi, Mehrtash
    PATTERNS, 2023, 4 (04):
  • [27] Q-SupCon: Quantum-Enhanced Supervised Contrastive Learning Architecture within the Representation Learning Framework
    Don, Asitha Kottahachchi Kankanamge
    Khalil, Ibrahim
    ACM TRANSACTIONS ON QUANTUM COMPUTING, 2025, 6 (01):
  • [28] Dual-channel graph contrastive learning for self-supervised graph-level representation learning
    Luo, Zhenfei
    Dong, Yixiang
    Zheng, Qinghua
    Liu, Huan
    Luo, Minnan
    PATTERN RECOGNITION, 2023, 139
  • [29] Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning
    Liang, Meiyu
    Du, Junping
    Cao, Xiaowen
    Yu, Yang
    Lu, Kangkang
    Xue, Zhe
    Zhang, Min
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [30] Question-response representation with dual-level contrastive learning for improving knowledge tracing
    Zhao, Yan
    Ma, Huifang
    Wang, Jing
    He, Xiangchun
    Chang, Liang
    INFORMATION SCIENCES, 2024, 658