Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions

Cited by: 15
Authors
Canary, Jana D. [1]
Blizzard, Leigh [1]
Barry, Ronald P. [2]
Hosmer, David W. [3]
Quinn, Stephen J. [4]
Affiliations
[1] Univ Tasmania, Menzies Res Inst Tasmania, Hobart, Tas 7000, Australia
[2] Univ Alaska Fairbanks, Dept Math & Stat, Fairbanks, AK 99775 USA
[3] Univ Massachusetts, Dept Publ Hlth, Amherst, MA 01003 USA
[4] Flinders Univ S Australia, Flinders Clin Effectiveness, Adelaide, SA 5001, Australia
Funding
Medical Research Council (UK);
Keywords
Goodness-of-fit; Hosmer-Lemeshow; Noncanonical generalized linear models; Pigeon-Heyse; Tsiatis; LOGISTIC-REGRESSION-MODEL; SPARSE DATA; MORTALITY; TESTS;
DOI
10.1002/bimj.201400079
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Generalized linear models (GLMs) with the canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLMs. The properties of logistic GLMs make the development of GOF statistics relatively straightforward, but development can be more difficult under noncanonical links. Although GOF tests for logistic GLMs with continuous covariates (GLMCCs) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of the link function chosen. We generalize the Tsiatis GOF statistic originally developed for logistic GLMCCs (T_G) so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J^2) statistics can be applied directly. In a simulation study, T_G, HL, and J^2 were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The T_G statistic consistently maintained Type I error rates, while those of HL and J^2 were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, T_G had more power than HL or J^2.
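As an illustration of the abstract's point that HL can be applied directly under a noncanonical link, the following is a minimal Python sketch of the textbook Hosmer-Lemeshow statistic computed from fitted probabilities. It is not the authors' implementation. The function names `probit_inverse` and `hosmer_lemeshow`, the near-equal-size grouping by sorted fitted probability, and the conventional g - 2 degrees of freedom are all assumptions made here; the abstract does not specify the grouping method used in the paper.

```python
import math


def probit_inverse(eta):
    """Inverse probit link: map a linear predictor to a probability."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def hosmer_lemeshow(y, p_hat, n_groups=10):
    """Return (statistic, degrees_of_freedom) for the textbook HL statistic.

    y      : sequence of 0/1 outcomes
    p_hat  : fitted probabilities from a binary GLM; the link function
             only enters through these fitted values
    Assumes every group is non-empty and has a mean fitted probability
    strictly between 0 and 1.
    """
    # Sort observations by fitted probability (sorted() is stable).
    pairs = sorted(zip(p_hat, y), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        # Near-equal-size groups over the sorted observations (assumed grouping).
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        n_g = len(group)
        expected = sum(p for p, _ in group)    # sum of fitted probabilities
        observed = sum(obs for _, obs in group)  # count of events
        p_bar = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))
    # g - 2 is the conventional reference for the logistic case (assumption here).
    return stat, n_groups - 2
```

For a probit model, the fitted probabilities would be obtained as `[probit_inverse(eta) for eta in linear_predictors]` before calling `hosmer_lemeshow(y, p_hat)`; the statistic itself is unchanged by the choice of link.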
Pages: 674-690
Page count: 17