A study of learning likely data structure properties using machine learning models

Cited by: 1
Authors
Usman, Muhammad [1]
Wang, Wenxi [1]
Wang, Kaiyuan [1]
Yelen, Cagdas [1]
Dini, Nima [1]
Khurshid, Sarfraz [1]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Science Foundation (USA);
Keywords
Data structure invariants; Machine learning; Korat; Learnability;
DOI
10.1007/s10009-020-00577-w
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification codes
081202 ; 0835 ;
Abstract
Data structure properties are important for many testing and analysis tasks. For example, model checkers use these properties to find program faults. These properties are often written manually, which can be error-prone and lead to false alarms. This paper presents the results of controlled experiments performed using existing machine learning (ML) models on various data structures. These data structures are dynamic and reside on the program heap. We use ten data structure subjects and ten ML models to evaluate the learnability of data structure properties. The study reveals five key findings. One, most of the ML models perform well in learning data structure properties, but some of the ML models, such as quadratic discriminant analysis and Gaussian naive Bayes, are not suitable for learning data structure properties. Two, most of the ML models have high performance even when trained on just 1% of data samples. Three, certain data structure properties, such as binary heap and red-black tree, are more learnable than others. Four, there are no significant differences between the learnability of varied-size (i.e., up to a certain size) and fixed-size data structures. Five, there can be significant differences in performance based on the encoding used. These findings show that using machine learning models to learn data structure properties is very promising. We believe that these properties, once learned, can be used to provide a run-time check to see whether a program state at a particular point satisfies the learned property. Learned properties can also be employed in the future to automate static and dynamic analysis, which would enhance software testing and verification techniques.
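As a rough illustration of the kind of setup the abstract describes, the following Python sketch builds a labeled dataset for one property (the binary max-heap ordering over an array). It is only a sketch under stated assumptions: the names `is_max_heap`, `encode`, and `labeled_samples`, the padded-vector encoding, and the exhaustive enumeration are all hypothetical and are not taken from the paper, which may generate and encode its samples differently.

```python
from itertools import product


def is_max_heap(values):
    """Hand-written property: every element is <= its parent in the array layout."""
    return all(values[(i - 1) // 2] >= values[i] for i in range(1, len(values)))


def encode(values, max_size):
    """Assumed encoding: [size, elements..., -1 padding] with a fixed length."""
    return [len(values)] + list(values) + [-1] * (max_size - len(values))


def labeled_samples(max_size, max_value):
    """Enumerate every array of size 0..max_size and label it with the property.

    This mirrors the "varied-size (up to a certain size)" setting; a
    fixed-size dataset would keep only the arrays of length max_size.
    """
    samples = []
    for size in range(max_size + 1):
        for values in product(range(max_value + 1), repeat=size):
            samples.append((encode(values, max_size), is_max_heap(values)))
    return samples


samples = labeled_samples(max_size=3, max_value=2)
positives = sum(1 for _, label in samples if label)
print(len(samples), positives)
```

The resulting feature vectors and boolean labels could then be handed to any off-the-shelf classifier, and a trained model's prediction on an encoded program state would play the role of the run-time check mentioned in the abstract.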
Pages: 601-615
Number of pages: 15