Using observation-level random effects to model overdispersion in count data in ecology and evolution

Cited by: 831
Author
Harrison, Xavier A. [1 ]
Affiliation
[1] Zool Soc London, Inst Zool, London NW1 4RY, England
Source
PEERJ | 2014 / Vol. 2
Keywords
Observation-level random effect; Explained variance; r-squared; Poisson-lognormal models; Quasi-Poisson; Generalized linear mixed models; INFERENCE;
DOI
10.7717/peerj.616
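Chinese Library Classification
O [Mathematics, Physical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code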
07 ; 0710 ; 09 ;
Abstract
Overdispersion is common in models of count data in ecology and evolutionary biology, and can occur due to missing covariates, non-independent (aggregated) data, or an excess frequency of zeroes (zero-inflation). Accounting for overdispersion in such models is vital, as failing to do so can lead to biased parameter estimates and false conclusions regarding hypotheses of interest. Observation-level random effects (OLRE), where each data point receives a unique level of a random effect that models the extra-Poisson variation present in the data, are commonly employed to cope with overdispersion in count data. However, studies investigating the efficacy of observation-level random effects as a means to deal with overdispersion are scarce. Here I use simulations to show that in cases where overdispersion is caused by random extra-Poisson noise, or aggregation in the count data, observation-level random effects yield more accurate parameter estimates compared to when overdispersion is simply ignored. Conversely, OLRE fail to reduce bias in zero-inflated data, and in some cases increase bias at high levels of overdispersion. There was a positive relationship between the magnitude of overdispersion and the degree of bias in parameter estimates. Critically, the simulations reveal that failing to account for overdispersion in mixed models can erroneously inflate measures of explained variance (r²), which may lead to researchers overestimating the predictive power of variables of interest. This work suggests that the use of observation-level random effects provides a simple and robust means to account for overdispersion in count data, but also that their ability to minimise bias is not uniform across all types of overdispersion, so they must be applied judiciously.
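The extra-Poisson noise described in the abstract can be illustrated with a minimal simulation sketch. It assumes a Poisson-lognormal data-generating process in which each observation receives its own normally distributed deviation on the log scale. The function names, parameter values, and the variance-to-mean ratio used as a dispersion check are all hypothetical choices made for illustration; they are not the simulation design used in the paper.

```python
import numpy as np


def simulate_counts(n, intercept, slope, olre_sd, seed=0):
    """Simulate Poisson counts with an observation-level deviation (assumed design)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                   # a single covariate
    eps = rng.normal(0.0, olre_sd, size=n)   # one random deviation per observation
    mu = np.exp(intercept + slope * x + eps) # log link: noise enters the linear predictor
    y = rng.poisson(mu)
    return x, y


def dispersion_ratio(y):
    """Variance-to-mean ratio; close to 1 for Poisson data with a constant mean."""
    return y.var(ddof=1) / y.mean()


# No covariate effect (slope = 0), so any excess variance comes from eps alone.
_, y_poisson = simulate_counts(5000, intercept=1.0, slope=0.0, olre_sd=0.0)
_, y_overdispersed = simulate_counts(5000, intercept=1.0, slope=0.0, olre_sd=1.0)

ratio_poisson = dispersion_ratio(y_poisson)
ratio_overdispersed = dispersion_ratio(y_overdispersed)
print(ratio_poisson, ratio_overdispersed)
```

With the observation-level standard deviation set to zero the counts are plain Poisson and the ratio stays near 1; a non-zero standard deviation pushes the ratio well above 1, which is the pattern an OLRE term in a mixed model is meant to absorb.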
Pages: 19
Related papers
50 in total
  • [1] A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution
    Harrison, Xavier A.
    PEERJ, 2015, 3
  • [2] Modelling count data with overdispersion and spatial effects
    Susanne Gschlößl
    Claudia Czado
    Statistical Papers, 2008, 49
  • [3] Modelling count data with overdispersion and spatial effects
    Gschloessl, Susanne
    Czado, Claudia
    STATISTICAL PAPERS, 2008, 49 (03) : 531 - 552
  • [4] Using the negative binomial distribution to model overdispersion in ecological count data
    Linden, Andreas
    Mantyniemi, Samu
    ECOLOGY, 2011, 92 (07) : 1414 - 1421
  • [5] Observation-Level and Parametric Interaction for High-Dimensional Data Analysis
    Self, Jessica Zeitz
    Dowling, Michelle
    Wenskovitch, John
    Crandell, Ian
    Wang, Ming
    House, Leanna
    Leman, Scotland
    North, Chris
    ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2018, 8 (02)
  • [6] Data-driven segmentation of observation-level logistic regression models
    Choi, Yunjin
    Park, No-Wook
    Lee, Woojoo
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2025,
  • [7] Tests for serial correlation and overdispersion in a count data regression model
    Johansson, P
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 1995, 53 (3-4) : 153 - 164
  • [8] A micro-level claim count model with overdispersion and reporting delays
    Avanzi, Benjamin
    Wong, Bernard
    Yang, Xinda
    INSURANCE MATHEMATICS & ECONOMICS, 2016, 71 : 1 - 14
  • [9] Empirical Hierarchical Modelling for Count Data using the Spatial Random Effects Model
    Sengupta, Aritra
    Cressie, Noel
    SPATIAL ECONOMIC ANALYSIS, 2013, 8 (03) : 389 - 418
  • [10] Random Forests in Count Data Modelling: An Analysis of the Influence of Data Features and Overdispersion on Regression Performance
    Mushagalusa, Ciza Arsene
    Fandohan, Adande Belarmain
    Glele Kakai, Romain
    JOURNAL OF PROBABILITY AND STATISTICS, 2022, 2022