A Machine-Learning Algorithm with Disjunctive Model for Data-Driven Program Analysis

Cited by: 18
Authors
Jeon, Minseok [1]
Jeong, Sehun [1]
Cha, Sungdeok [1]
Oh, Hakjoo [1]
Affiliations
[1] Korea Univ, Dept Comp Sci & Engn, 145 Anam Ro, Seoul 02841, South Korea
Keywords
Data-driven program analysis; static analysis; context-sensitivity; flow-sensitivity; POINTS-TO ANALYSIS; CONTEXT-SENSITIVITY; STRATEGY; PRECISE; OCTAGON
DOI
10.1145/3293607
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We present a new machine-learning algorithm with a disjunctive model for data-driven program analysis. One major challenge in static program analysis is the substantial manual effort required to tune analysis performance. Data-driven program analysis has recently emerged to address this challenge by automatically adjusting the analysis based on data through a learning algorithm. Although this approach has proven promising for various program analysis tasks, its effectiveness has been limited by simple learning models and algorithms that cannot capture sophisticated, in particular disjunctive, program properties. To overcome this shortcoming, this article presents a new disjunctive model for data-driven program analysis, as well as a learning algorithm that finds the model parameters. The model uses Boolean formulas over atomic features and can therefore express nonlinear combinations of program properties. A key technical challenge is to efficiently determine a set of good Boolean formulas, since brute-force search would be impractical. We present a stepwise, greedy algorithm that learns Boolean formulas efficiently. We show the effectiveness and generality of our algorithm with two static analyzers: a context-sensitive points-to analysis for Java and a flow-sensitive interval analysis for C. Experimental results show that our automated technique significantly improves on the performance of state-of-the-art techniques, including ones hand-crafted by human experts.
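The abstract does not spell out the model or the learning procedure, so the following is only a minimal sketch of the general idea: a formula in disjunctive normal form over atomic Boolean features, learned greedily one clause at a time. It is not the authors' algorithm. The feature names (`allocates`, `virtual_call`, `getter`, `static`), the labeled examples, and the functions `holds` and `learn_dnf` are all hypothetical and chosen for illustration.

```python
from typing import FrozenSet, List, Set

Features = FrozenSet[str]   # atomic features that hold for one program element
Clause = FrozenSet[str]     # conjunction of atomic features
Formula = List[Clause]      # disjunction of clauses (DNF)


def holds(formula: Formula, features: Features) -> bool:
    """A DNF formula holds if every atom of at least one clause is present."""
    return any(clause <= features for clause in formula)


def learn_dnf(positives: List[Features],
              negatives: List[Features],
              atoms: Set[str]) -> Formula:
    """Illustrative greedy learner (an assumption, not the paper's method).

    Builds one clause at a time. Each clause grows by the atom that
    excludes the most remaining negatives while keeping at least one
    uncovered positive. Stops when no further clause can be built.
    """
    formula: Formula = []
    uncovered = list(positives)
    while uncovered:
        clause: Set[str] = set()
        pos, neg = uncovered, list(negatives)
        while neg:
            best = None
            for atom in sorted(atoms - clause):
                pos_kept = [p for p in pos if atom in p]
                neg_kept = [n for n in neg if atom in n]
                if not pos_kept or len(neg_kept) == len(neg):
                    continue  # loses every positive, or excludes no negative
                key = (len(neg_kept), -len(pos_kept))
                if best is None or key < best[0]:
                    best = (key, atom, pos_kept, neg_kept)
            if best is None:
                break  # no atom makes progress on this clause
            _, atom, pos, neg = best
            clause.add(atom)
        if neg or not clause:
            break  # remaining positives cannot be separated from negatives
        formula.append(frozenset(clause))
        uncovered = [p for p in uncovered if not clause <= p]
    return formula


# Hypothetical training data: elements that should (positives) or should
# not (negatives) receive the more precise, more expensive analysis.
positives = [
    frozenset({"allocates", "virtual_call"}),
    frozenset({"getter"}),
    frozenset({"allocates", "virtual_call", "static"}),
]
negatives = [
    frozenset({"allocates"}),
    frozenset({"virtual_call"}),
    frozenset({"static"}),
]
atoms = {"allocates", "virtual_call", "getter", "static"}

formula = learn_dnf(positives, negatives, atoms)
for clause in formula:
    print(" AND ".join(sorted(clause)))
```

In this toy data, `allocates` and `virtual_call` each appear in a negative example on their own, so a rule that weighs atoms independently would struggle to accept their conjunction while rejecting each atom alone. A disjunction of conjunctions can express that distinction directly, which is the kind of nonlinear combination the abstract refers to.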
Pages: 41