Effects of influential points and sample size on the selection and replicability of multivariable fractional polynomial models

Cited by: 1
Authors
Sauerbrei, Willi [1]
Kipruto, Edwin [1]
Balmford, James [1]
Affiliations
[1] Univ Freiburg, Inst Med Biometry & Stat, Fac Med, Freiburg, Germany
Keywords
Continuous variable; Fractional polynomial; Influential point; Model building; Sample size; Simulated data; Continuous predictors; Regression; Transformation; Stability; Variables; Splines
DOI
10.1186/s41512-023-00145-1
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: The multivariable fractional polynomial (MFP) approach combines variable selection using backward elimination with a function selection procedure (FSP) for fractional polynomial (FP) functions. It is a relatively simple approach which can be easily understood without advanced training in statistical modeling. For continuous variables, a closed test procedure is used to decide between no effect, linear, FP1, or FP2 functions. Influential points (IPs) and small sample sizes can both have a strong impact on a selected function and MFP model.
Methods: We used simulated data with six continuous and four categorical predictors to illustrate approaches which can help to identify IPs with an influence on function selection and the MFP model. The approaches use leave-one-out or leave-two-out deletion and two related techniques for a multivariable assessment. In eight subsamples, we also investigated the effects of sample size and model replicability, the latter by using three non-overlapping subsamples with the same sample size. For better illustration, a structured profile was used to provide an overview of all analyses conducted.
Results: The results showed that one or more IPs can drive the functions and models selected. In addition, with a small sample size, MFP was not able to detect some non-linear functions and the selected model differed substantially from the true underlying model. However, when the sample size was relatively large and regression diagnostics were carefully conducted, MFP selected functions or models that were similar to the underlying true model.
Conclusions: For smaller sample sizes, IPs and low power are important reasons that the MFP approach may not be able to identify underlying functional relationships for continuous variables, and selected models might differ substantially from the true model. However, for larger sample sizes, a carefully conducted MFP analysis is often a suitable way to select a multivariable regression model which includes continuous variables. In such a case, MFP can be the preferred approach to derive a multivariable descriptive model.
Pages: 17
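The leave-one-out idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it handles a single continuous predictor, considers only FP1 functions, and picks the power by lowest residual sum of squares instead of running the closed test procedure. The function names (`fp_transform`, `best_fp1_power`, `leave_one_out_powers`) and the simulated data are assumptions made for illustration; the power set is the conventional FP set, with power 0 denoting the natural logarithm.

```python
import numpy as np

# Conventional FP power set; 0 stands for log(x). Assumes x > 0.
FP_POWERS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0)


def fp_transform(x, p):
    """Return x**p, or log(x) when p == 0 (FP convention)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0.0 else x ** p


def best_fp1_power(x, y):
    """Pick the FP1 power with the smallest residual sum of squares.

    Simplification: the real FSP uses a closed test procedure that also
    compares against 'no effect', linear, and FP2 functions.
    """
    best_p, best_rss = None, np.inf
    for p in FP_POWERS:
        X = np.column_stack([np.ones(len(x)), fp_transform(x, p)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if rss < best_rss:
            best_p, best_rss = p, rss
    return best_p


def leave_one_out_powers(x, y):
    """Refit after deleting each observation in turn.

    Returns the power selected on the full data and the indices of
    observations whose deletion changes the selected power.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    full = best_fp1_power(x, y)
    idx = np.arange(len(x))
    flagged = [
        i for i in range(len(x))
        if best_fp1_power(x[idx != i], y[idx != i]) != full
    ]
    return full, flagged


# Hypothetical simulated data: a logarithmic effect plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 10.0, size=200)
y = np.log(x) + rng.normal(0.0, 0.05, size=200)
full_power, flagged = leave_one_out_powers(x, y)
print("power on full data:", full_power)
print("observations that change the selected power:", flagged)
```

Observations listed in `flagged` are candidates for closer inspection as potential influential points; with a large sample and a clear signal, the list would be expected to be short or empty.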