DENDROID: A text mining approach to analyzing and classifying code structures in Android malware families

Cited by: 155
Authors
Suarez-Tangil, Guillermo [1]
Tapiador, Juan E. [1]
Peris-Lopez, Pedro [1]
Blasco, Jorge [1]
Affiliation
[1] Univ Carlos III Madrid, Dept Comp Sci, Comp Secur COSEC Lab, Madrid 28911, Spain
Keywords
Malware analysis; Software similarity and classification; Text mining; Information retrieval; Smartphones; Android OS;
DOI
10.1016/j.eswa.2013.07.106
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid proliferation of smartphones over the last few years has come hand in hand with an impressive growth in the number and sophistication of malicious apps targeting smartphone users. The availability of reuse-oriented development methodologies and automated malware production tools makes it exceedingly easy to produce new specimens. As a result, market operators and malware analysts are increasingly overwhelmed by the amount of newly discovered samples that must be analyzed. This situation has stimulated research in intelligent instruments to automate parts of the malware analysis process. In this paper, we introduce DENDROID, a system based on text mining and information retrieval techniques for this task. Our approach is motivated by a statistical analysis of the code structures found in a dataset of Android OS malware families, which reveals some parallelisms with classical problems in those domains. We then adapt the standard Vector Space Model and reformulate the modelling process followed in text mining applications. This enables us to measure similarity between malware samples, which is then used to automatically classify them into families. We also investigate the application of hierarchical clustering over the feature vectors obtained for each malware family. The resulting dendrograms resemble the so-called phylogenetic trees for biological species, allowing us to conjecture about evolutionary relationships among families. Our experimental results suggest that the approach is remarkably accurate and deals efficiently with large databases of malware instances. © 2013 Elsevier Ltd. All rights reserved.
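The Vector Space Model step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token names (`cs1`, `cs2`, ...), the family names, the smoothed TF-IDF weighting, and the nearest-family rule are all assumptions chosen for illustration. Each family is treated as a "document" whose "terms" are code-structure identifiers, and a new sample is assigned to the family whose vector has the highest cosine similarity.

```python
import math
from collections import Counter

# Hypothetical data: each family is a bag of code-structure identifiers.
FAMILIES = {
    "FamilyA": ["cs1", "cs1", "cs2", "cs3"],
    "FamilyB": ["cs3", "cs4", "cs4", "cs5"],
    "FamilyC": ["cs2", "cs5", "cs6", "cs6"],
}


def document_frequencies(families):
    """Count in how many families each code structure appears."""
    df = Counter()
    for tokens in families.values():
        df.update(set(tokens))
    return df


def tf_idf(tokens, df, n_families):
    """Weight a bag of tokens with term frequency times a smoothed IDF.

    The smoothing (+1 terms) is an assumption made here so that unseen
    or ubiquitous structures never produce a division by zero.
    """
    tf = Counter(tokens)
    total = len(tokens)
    return {
        t: (count / total) * (math.log((1 + n_families) / (1 + df.get(t, 0))) + 1)
        for t, count in tf.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(sample_tokens, families):
    """Return the family whose vector is most similar to the sample."""
    df = document_frequencies(families)
    n = len(families)
    sample_vec = tf_idf(sample_tokens, df, n)
    family_vecs = {name: tf_idf(tokens, df, n) for name, tokens in families.items()}
    return max(family_vecs, key=lambda name: cosine(sample_vec, family_vecs[name]))


if __name__ == "__main__":
    # A hypothetical unknown sample sharing most structures with FamilyA.
    print(classify(["cs1", "cs2", "cs1"], FAMILIES))
```

The same family vectors could also be fed to a hierarchical clustering routine to obtain a dendrogram, which is the second analysis the abstract describes.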
Pages: 1104-1117
Number of pages: 14