ConFunc: Enhanced Binary Function-Level Representation through Contrastive Learning

Cited by: 0
Authors
Li, Longfei [1 ]
Yin, Xiaokang [2 ]
Li, Xiao [2 ]
Zhu, Xiaoya [2 ]
Liu, Shengli [2 ]
Affiliations
[1] Zhengzhou Univ, Zhengzhou, Peoples R China
[2] Informat Engn Univ, Zhengzhou, Peoples R China
Keywords
binary code similarity detection; machine learning; contrastive learning; function embeddings;
DOI
10.1109/TrustCom60117.2023.00169
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Binary code similarity detection (BCSD) has numerous applications, including malware detection, vulnerability search, plagiarism detection, and patch identification. Recent studies have demonstrated that, with the rapid progress of machine learning (ML) techniques, various ML-based BCSD approaches outperform traditional methods. However, current ML-based BCSD approaches tend to ignore the issue of training samples, and most are based on supervised learning, which suffers from labelling difficulties. To mitigate these issues, we propose ConFunc: a function-level binary code similarity detection framework based on contrastive learning. Performance evaluation shows that ConFunc improves the Mean Reciprocal Rank (MRR) and Recall rate (Recall@1) of baseline models by fully harnessing the potential of the data. Additionally, ConFunc demonstrates stronger performance in data-scarce scenarios, matching the baseline model's full-dataset performance using only 10% of the complete dataset. In real-world patch identification and vulnerability search tasks, ConFunc consistently outperforms the other baseline models in MRR and Recall@10.
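The retrieval metrics the abstract reports, MRR and Recall@k, can be sketched in plain Python. This is a minimal illustration of how such metrics are typically computed for function-retrieval evaluation; the function identifiers in the example are hypothetical and not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1/rank of the first correct match."""
    total = 0.0
    for results, rel in zip(ranked_lists, relevant):
        for rank, item in enumerate(results, start=1):
            if item == rel:  # first hit determines the reciprocal rank
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

def recall_at_k(ranked_lists, relevant, k):
    """Recall@k: fraction of queries whose match appears in the top-k."""
    hits = sum(1 for results, rel in zip(ranked_lists, relevant)
               if rel in results[:k])
    return hits / len(ranked_lists)

# Two hypothetical queries: the ground-truth match ranks 1st and 2nd.
ranked = [["f1", "f2", "f3"], ["f2", "f1", "f3"]]
truth = ["f1", "f1"]
print(mean_reciprocal_rank(ranked, truth))  # (1/1 + 1/2) / 2 = 0.75
print(recall_at_k(ranked, truth, 1))        # 1 of 2 queries hit at rank 1 = 0.5
```

In a BCSD evaluation, each "query" would be a function embedding and the ranked list its nearest neighbours in the candidate pool, ranked by embedding similarity.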
Pages: 1241 - 1248 (8 pages)
Related papers (50 total)
  • [21] Contrastive learning enhanced by graph neural networks for Universal Multivariate Time Series Representation
    College of Artificial Intelligence, Southwest University, Chongqing 400715, China
    Inf. Syst.,
  • [22] Universal representation learning for multivariate time series using the instance-level and cluster-level supervised contrastive learning
    Moradinasab, Nazanin
    Sharma, Suchetha
    Bar-Yoseph, Ronen
    Radom-Aizik, Shlomit
    Bilchick, Kenneth C.
    Cooper, Dan M.
    Weltman, Arthur
    Brown, Donald E.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 38 (03) : 1493 - 1519
  • [24] Patient-Level Contrastive Learning for Enhanced Biomarker Prediction in Retinal Imaging
    Kim, Hyeonmin
    Seo, Chanyang
    Cho, Yunnie
    Yoo, Tae Keun
    DATA ENGINEERING IN MEDICAL IMAGING, DEMI 2024, 2025, 15265 : 125 - 133
  • [25] Enhancing chemical reaction search through contrastive representation learning and human-in-the-loop
    Youngchun Kwon
    Hyunjeong Jeon
    Joonhyuk Choi
    Youn-Suk Choi
    Seokho Kang
    Journal of Cheminformatics, 17 (1)
  • [26] SynapseCLR: Uncovering features of synapses in primary visual cortex through contrastive representation learning
    Wilson, Alyssa
    Babadi, Mehrtash
    PATTERNS, 2023, 4 (04):
  • [27] Q-SupCon: Quantum-Enhanced Supervised Contrastive Learning Architecture within the Representation Learning Framework
    Don, Asitha Kottahachchi Kankanamge
    Khalil, Ibrahim
    ACM TRANSACTIONS ON QUANTUM COMPUTING, 2025, 6 (01):
  • [28] Dual-channel graph contrastive learning for self-supervised graph-level representation learning
    Luo, Zhenfei
    Dong, Yixiang
    Zheng, Qinghua
    Liu, Huan
    Luo, Minnan
    PATTERN RECOGNITION, 2023, 139
  • [29] Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning
    Liang, Meiyu
    Du, Junping
    Cao, Xiaowen
    Yu, Yang
    Lu, Kangkang
    Xue, Zhe
    Zhang, Min
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [30] Question-response representation with dual-level contrastive learning for improving knowledge tracing
    Zhao, Yan
    Ma, Huifang
    Wang, Jing
    He, Xiangchun
    Chang, Liang
    INFORMATION SCIENCES, 2024, 658