A study of learning likely data structure properties using machine learning models

Cited by: 1
Authors
Usman, Muhammad [1]
Wang, Wenxi [1]
Wang, Kaiyuan [1]
Yelen, Cagdas [1]
Dini, Nima [1]
Khurshid, Sarfraz [1]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
Keywords
Data structure invariants; Machine learning; Korat; Learnability
DOI
10.1007/s10009-020-00577-w
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Data structure properties are important for many testing and analysis tasks. For example, model checkers use these properties to find program faults. These properties are often written manually, which can be error-prone and lead to false alarms. This paper presents the results of controlled experiments performed using existing machine learning (ML) models on various data structures. These data structures are dynamic and reside on the program heap. We use ten data structure subjects and ten ML models to evaluate the learnability of data structure properties. The study reveals five key findings. One, most of the ML models perform well in learning data structure properties, but some, such as quadratic discriminant analysis and Gaussian naive Bayes, are not suitable for this task. Two, most of the ML models perform well even when trained on just 1% of the data samples. Three, certain data structure properties, such as those of binary heaps and red-black trees, are more learnable than others. Four, there are no significant differences between the learnability of varied-size (i.e., up to a certain size) and fixed-size data structures. Five, there can be significant differences in performance depending on the encoding used. These findings show that using machine learning models to learn data structure properties is very promising. We believe that these properties, once learned, can be used to provide a run-time check of whether a program state at a particular point satisfies the learned property. Learned properties can also be employed in the future to automate static and dynamic analysis, which would enhance software testing and verification techniques.
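To make the notion of a data structure property concrete, the following is a minimal sketch in Python of a hand-written invariant for a binary max-heap stored as an array, together with a small exhaustive enumeration that labels every candidate as valid or invalid. It is an illustration only, not the authors' setup: the names `is_max_heap` and `labeled_samples`, the array encoding, and the bounds used are all assumptions made for this example. Labeled samples of this kind are the sort of input a classifier could be trained on.

```python
from itertools import product


def is_max_heap(values):
    """Hand-written invariant (hypothetical helper): in an array-encoded
    binary max-heap, every parent is >= each of its children."""
    return all(
        values[(i - 1) // 2] >= values[i]
        for i in range(1, len(values))
    )


def labeled_samples(size, max_value):
    """Enumerate every array of the given size over 0..max_value and
    label it with the invariant (True = valid heap, False = invalid)."""
    return [
        (candidate, is_max_heap(candidate))
        for candidate in product(range(max_value + 1), repeat=size)
    ]


# Illustrative bounds only: 3 slots, values in {0, 1, 2}.
samples = labeled_samples(size=3, max_value=2)
positives = [candidate for candidate, ok in samples if ok]
print(len(samples), len(positives))  # prints "27 14"
```

A predicate like `is_max_heap` is also the shape a run-time check would take: it receives an encoded program state and returns whether the property holds. A learned model would stand in for the hand-written body.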
Pages: 601-615
Number of pages: 15