Predictive performance of the binary logit model in unbalanced samples

Cited by: 114
Authors
Cramer, JS [1]
Affiliations
[1] Tinbergen Inst, Amsterdam, Netherlands
Keywords
goodness of fit; logistic regression; predicted probabilities; unequal sample proportions
DOI
10.1111/1467-9884.00173
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
In a binary logit analysis with unequal sample frequencies of the two outcomes the less frequent outcome always has lower estimated prediction probabilities than the other outcome. This effect is unavoidable, and its extent varies inversely with the fit of the model, as given by a new measure that follows naturally from the argument. Unbalanced samples with a poor fit are typical for survey analyses in the social sciences and epidemiology, and there the difference in prediction probabilities is most acute. It affects two common diagnostics: the within-sample 'percentage correctly predicted' and the identification of outliers. Partial remedies are suggested.
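The effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the simulated data, the coefficient values, the helper `fit_logit`, and the name `lam` are all assumptions made for illustration. It fits a logit with an intercept by maximum likelihood and compares the mean predicted probability that each group receives for its own outcome. The quantity `lam` (mean fitted probability among the ones minus that among the zeros) is used here as a stand-in for the kind of fit measure the abstract alludes to. For a maximum-likelihood logit with an intercept, the fitted probabilities average to the sample frequency `alpha`, which implies that the gap between the two groups equals `(1 - 2*alpha) * (1 - lam)`.

```python
import numpy as np


def fit_logit(X, y, iters=50):
    """Maximum-likelihood logit via Newton-Raphson.

    Hypothetical helper for this sketch; X must contain a constant column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


# Simulated unbalanced sample (assumed values, not data from the paper).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
true_p = 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * x)))
y = (rng.random(n) < true_p).astype(float)

X = np.column_stack([np.ones(n), x])
beta = fit_logit(X, y)
p_hat = 1.0 / (1.0 + np.exp(-X @ beta))

alpha = y.mean()                               # sample frequency of y = 1
mean_p_rare = p_hat[y == 1].mean()             # mean P(y=1) among the ones
mean_p_common = (1.0 - p_hat[y == 0]).mean()   # mean P(y=0) among the zeros
lam = p_hat[y == 1].mean() - p_hat[y == 0].mean()
gap = mean_p_common - mean_p_rare

print(f"alpha = {alpha:.3f}, lam = {lam:.3f}")
print(f"rare outcome: {mean_p_rare:.3f}, common outcome: {mean_p_common:.3f}")
print(f"gap = {gap:.3f}, (1 - 2*alpha)*(1 - lam) = {(1 - 2 * alpha) * (1 - lam):.3f}")
```

In this sketch the gap shrinks as `lam` approaches 1 and vanishes when `alpha` is 0.5, which matches the abstract's statement that the effect is strongest in unbalanced samples with a poor fit.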
Pages: 85-94
Number of pages: 10