A study of learning likely data structure properties using machine learning models

Cited by: 1

Authors:

Usman, Muhammad [1]

Wang, Wenxi [1]

Wang, Kaiyuan [1]

Yelen, Cagdas [1]

Dini, Nima [1]

Khurshid, Sarfraz [1]

Affiliations:

[1] Univ Texas Austin, Austin, TX 78712 USA

Source:

INTERNATIONAL JOURNAL ON SOFTWARE TOOLS FOR TECHNOLOGY TRANSFER | 2020, Vol. 22, Issue 05

Funding:

U.S. National Science Foundation;

Keywords:

Data structure invariants; Machine learning; Korat; Learnability;

DOI:

10.1007/s10009-020-00577-w

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Data structure properties are important for many testing and analysis tasks. For example, model checkers use these properties to find program faults. These properties are often written manually which can be error prone and lead to false alarms. This paper presents the results of controlled experiments performed using existing machine learning (ML) models on various data structures. These data structures are dynamic and reside on the program heap. We use ten data structure subjects and ten ML models to evaluate the learnability of data structure properties. The study reveals five key findings. One, most of the ML models perform well in learning data structure properties, but some of the ML models such as quadratic discriminant analysis and Gaussian naive Bayes are not suitable for learning data structure properties. Two, most of the ML models have high performance even when trained on just 1% of data samples. Three, certain data structure properties such as binary heap and red black tree are more learnable than others. Four, there are no significant differences between the learnability of varied-size (i.e., up to a certain size) and fixed-size data structures. Five, there can be significant differences in performance based on the encoding used. These findings show that using machine learning models to learn data structure properties is very promising. We believe that these properties, once learned, can be used to provide a run-time check to see whether a program state at a particular point satisfies the learned property. Learned properties can also be employed in the future to automate static and dynamic analysis, which would enhance software testing and verification techniques.
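As a rough illustration of the kind of setup the abstract describes (positive and negative structure encodings, plus a run-time check), the minimal Python sketch below enumerates small array-encoded candidates and labels each one with a handwritten max-heap predicate. The array encoding, the value bounds, and the names `is_max_heap` and `labeled_samples` are illustrative assumptions, not the paper's actual subjects, encodings, or tooling. In the study's setting, an ML classifier trained on such labeled pairs would stand in for the handwritten predicate.

```python
from itertools import product


def is_max_heap(arr):
    """Return True if arr, read as a level-order binary tree, is a max-heap."""
    # Every non-root element must be no larger than its parent.
    return all(arr[(i - 1) // 2] >= arr[i] for i in range(1, len(arr)))


def labeled_samples(size, max_value):
    """Enumerate all arrays of `size` over 0..max_value with a validity label.

    Hypothetical encoding: each candidate structure is a flat list of ints.
    """
    return [
        (list(candidate), is_max_heap(candidate))
        for candidate in product(range(max_value + 1), repeat=size)
    ]


samples = labeled_samples(size=3, max_value=3)
positives = [x for x, ok in samples if ok]      # valid heaps
negatives = [x for x, ok in samples if not ok]  # invalid candidates
print(len(samples), len(positives), len(negatives))
```

Exhaustive enumeration is only feasible for very small bounds; it is used here purely to keep the sketch self-contained.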

Pages: 601-615

Number of pages: 15
