A mixed-integer exponential cone programming formulation for feature subset selection in logistic regression

Cited by: 1
Authors
Ahari, Sahand Asgharieh [1]
Kocuk, Burak [2]
Affiliations
[1] Univ Groningen, Fac Econ & Business, Groningen, Netherlands
[2] Sabanci Univ, Ind Engn Program, Istanbul, Turkiye
Keywords
Mixed-integer exponential cone programming; Machine learning; Sparse logistic regression; Classification; Variable selection
DOI
10.1016/j.ejco.2023.100069
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
Logistic regression is one of the widely-used classification tools to construct prediction models. For datasets with a large number of features, feature subset selection methods are considered to obtain accurate and interpretable prediction models, in which irrelevant and redundant features are removed. In this paper, we address the problem of feature subset selection in logistic regression using modern optimization techniques. To this end, we formulate this problem as a mixed-integer exponential cone program (MIEXP). To the best of our knowledge, this is the first time both nonlinear and discrete aspects of the underlying problem are fully considered within an exact optimization framework. We derive different versions of the MIEXP model by the means of regularization and goodness of fit measures including Akaike Information Criterion and Bayesian Information Criterion. Finally, we solve our MIEXP models using the solver MOSEK and evaluate the performance of our different versions over a set of toy examples and benchmark datasets. The results show that our approach is quite successful in obtaining accurate and interpretable prediction models compared to other methods from the literature. © 2023 The Author(s). Published by Elsevier Ltd on behalf of Association of European Operational Research Societies (EURO). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
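As a rough illustration of the ingredients the abstract names, the sketch below numerically checks the standard exponential-cone representation of the logistic loss term log(1 + exp(u)) and evaluates the usual AIC and BIC formulas. It is not taken from the paper: the function names are hypothetical, the cone convention K_exp = {(x1, x2, x3) : x1 >= x2 * exp(x3 / x2), x2 > 0} is assumed, and the authors' actual MIEXP constraints may differ.

```python
import math


def softplus(u: float) -> float:
    """Numerically stable log(1 + exp(u)), the per-sample logistic loss term."""
    return max(u, 0.0) + math.log1p(math.exp(-abs(u)))


def in_exp_cone(x1: float, x2: float, x3: float, tol: float = 1e-9) -> bool:
    """Membership test for the assumed cone x1 >= x2 * exp(x3 / x2), x2 > 0."""
    return x2 > 0 and x1 >= x2 * math.exp(x3 / x2) - tol


def softplus_epigraph_certificate(u: float, t: float):
    """Return (z1, z2) certifying t >= log(1 + exp(u)), or None if t is too small.

    Relies on the equivalence
        log(1 + e^u) <= t   <=>   e^(u - t) + e^(-t) <= 1,
    written with two exponential-cone constraints and one linear constraint.
    """
    z1, z2 = math.exp(u - t), math.exp(-t)
    feasible = (
        z1 + z2 <= 1.0 + 1e-9
        and in_exp_cone(z1, 1.0, u - t)
        and in_exp_cone(z2, 1.0, -t)
    )
    return (z1, z2) if feasible else None


def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion for a model with k selected parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion for k parameters and n samples."""
    return k * math.log(n) - 2 * log_likelihood
```

Under these assumptions, minimizing AIC or BIC over feature subsets amounts to minimizing the negative log-likelihood plus a penalty proportional to the number of selected features, which is where binary selection variables would enter a mixed-integer model.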
Pages: 31
Related papers
50 in total
  • [21] Mixed-Integer Nonlinear Programming Formulation for Distribution Networks Reliability Optimization
    Heidari, Alireza
    Dong, Zhao Yang
    Zhang, Daming
    Siano, Pierluigi
    Aghaei, Jamshid
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14 (05) : 1952 - 1961
  • [22] Mean Squared Variance Portfolio: A Mixed-Integer Linear Programming Formulation
    Fernandez-Navarro, Francisco
    Martinez-Nieto, Luisa
    Carbonero-Ruz, Mariano
    Montero-Romero, Teresa
    MATHEMATICS, 2021, 9 (03) : 1 - 13
  • [23] Subset selection by Mallows Cp: A mixed integer programming approach
    Miyashiro, Ryuhei
    Takano, Yuichi
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (01) : 325 - 331
  • [24] Mixed integer second-order cone programming formulations for variable selection in linear regression
    Miyashiro, Ryuhei
    Takano, Yuichi
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2015, 247 (03) : 721 - 731
  • [25] Comparison between mixed-integer and second order cone programming for autonomous overtaking
    Karlsson, Johan
    Murgovski, Nikolce
    Sjoberg, Jonas
    2018 EUROPEAN CONTROL CONFERENCE (ECC), 2018, : 387 - 392
  • [26] A compact formulation of a mixed-integer set
    Agra, Agostinho
    Constantino, Miguel
    OPTIMIZATION, 2010, 59 (05) : 729 - 745
  • [27] Mixed-integer quadratic programming is in NP
    Del Pia, Alberto
    Dey, Santanu S.
    Molinaro, Marco
    MATHEMATICAL PROGRAMMING, 2017, 162 (1-2) : 225 - 240
  • [28] Mixed-integer programming: A progress report
    Bixby, RE
    Fenelon, M
    Gu, ZH
    Rothberg, E
    Wunderling, R
    THE SHARPEST CUT: THE IMPACT OF MANFRED PADBERG AND HIS WORK, 2004, 4 : 309 - 325
  • [29] Mixed-integer nonlinear programming 2018
    Sahinidis, Nikolaos V.
    OPTIMIZATION AND ENGINEERING, 2019, 20 (02) : 301 - 306
  • [30] Lifting for conic mixed-integer programming
    Atamtürk, Alper
    Narayanan, Vishnu
    MATHEMATICAL PROGRAMMING, 2011, 126 : 351 - 363