
Tuesday, January 28, 2014

R (2) : Testing whether GLM overdispersion is significant


Is there a test to determine whether GLM overdispersion is significant?


I'm creating Poisson GLMs in R. To check for overdispersion I'm looking at the ratio of residual deviance to degrees of freedom provided by summary(model.name).
Is there a cutoff value or test for this ratio to be considered "significant"? I know that if it's >1 then the data are overdispersed, but if I have ratios relatively close to 1 [for example, one ratio of 1.7 (residual deviance = 25.48, df = 15) and another of 1.3 (residual deviance = 324, df = 253)], should I still switch to quasipoisson/negative binomial? I found this test for significance: 1-pchisq(residual deviance, df), but I've only seen it once, which makes me nervous. I also read (I can't find the source) that a ratio < 1.5 is generally safe. Opinions?
Thanks!
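
For reference, the deviance-based check mentioned in the question can be run on the two quoted cases as in the sketch below. The variable names are made up for illustration, and the chi-squared approximation to the residual deviance is only a rough guide (it can be poor when the counts are small), so this is a quick screen rather than a formal test.

```r
# Values quoted in the question (residual deviance and residual df)
res_dev <- c(25.48, 324)
res_df  <- c(15, 253)

# Ratio of residual deviance to residual degrees of freedom
ratio <- res_dev / res_df

# Upper-tail chi-squared p-value; equivalent to 1 - pchisq(res_dev, res_df)
p_value <- pchisq(res_dev, df = res_df, lower.tail = FALSE)

data.frame(res_dev, res_df, ratio = round(ratio, 2), p_value)
```

Both p-values fall below 0.05, so under this approximation even the ratio of about 1.3 would be flagged, because it rests on many more degrees of freedom.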





   
In the R package AER you will find the function dispersiontest, which implements a Test for Overdispersion by Cameron & Trivedi (1990).

It follows a simple idea: in a Poisson model, the mean is E(Y) = μ and the variance is Var(Y) = μ as well, so the two are equal. The test takes this equality as the null hypothesis against an alternative where Var(Y) = μ + c·f(μ), where a constant c < 0 means underdispersion and c > 0 means overdispersion. The function f(.) is some monotonic function (often linear or quadratic; the former is the default). The resulting test is equivalent to testing H0: c = 0 vs. H1: c ≠ 0 (by default, dispersiontest reports the one-sided alternative c > 0), and the test statistic used is a t statistic which is asymptotically standard normal under the null.
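
To make the idea concrete, here is a minimal sketch of the auxiliary-regression form of such a test, written by hand in base R on simulated data. It is not the AER source code; the simulated data set and all variable names (aux, c_hat, z_stat) are assumptions for illustration, and it covers only the linear case f(μ) = μ.

```r
# Simulate overdispersed counts (negative binomial), then fit a Poisson GLM
set.seed(1)
n       <- 500
x       <- rnorm(n)
mu_true <- exp(0.5 + 0.3 * x)
y       <- rnbinom(n, mu = mu_true, size = 2)  # Var(Y) = mu + mu^2 / 2

fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)

# Under H0 (Var(Y) = mu), E[(Y - mu)^2 - Y] = 0, so aux has mean zero.
# Under the alternative with f(mu) = mu, its mean is c.
aux <- ((y - mu)^2 - y) / mu

# Regress aux on a constant: the coefficient estimates c,
# and its t statistic is the test statistic
aux_fit <- lm(aux ~ 1)
c_hat   <- unname(coef(aux_fit)[1])
z_stat  <- unname(coef(summary(aux_fit))[1, "t value"])

# One-sided p-value for overdispersion (c > 0), normal approximation
p_value <- pnorm(z_stat, lower.tail = FALSE)

c(c_hat = c_hat, z = z_stat, p = p_value)
```

Because the simulated variance grows quadratically in μ, the linear form is only an approximation here; the point is just to show where the estimate of c and the test statistic come from.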

Example:

R> library(AER)
R> data(RecreationDemand)
R> rd <- glm(trips ~ ., data = RecreationDemand, family = poisson)
R> dispersiontest(rd,trafo=1)

Overdispersion test

data:  rd
z = 2.4116, p-value = 0.007941
alternative hypothesis: true dispersion is greater than 0
sample estimates:
dispersion 
    5.5658 

Here we clearly see that there is evidence of overdispersion (c is estimated to be 5.57) which speaks quite strongly against the assumption of equidispersion (i.e. c=0).
Note that if you do not use trafo=1, it will actually do a test of H0: c* = 1 vs. H1: c* ≠ 1 with c* = c + 1, which of course gives the same result as the other test, apart from the estimated dispersion being shifted by one. The reason for this, though, is that the latter corresponds to the common parametrization in a quasi-Poisson model.
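
The link to the quasi-Poisson parametrization can be illustrated with a small base-R sketch on simulated data (the data and variable names are assumptions, not part of the original example). It compares the Pearson-based dispersion of a Poisson fit with the dispersion reported by a quasi-Poisson fit, and with the shifted quantity c + 1 computed from the auxiliary variable.

```r
# Simulated overdispersed counts
set.seed(2)
n <- 400
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + x), size = 1.5)

pois_fit  <- glm(y ~ x, family = poisson)
quasi_fit <- glm(y ~ x, family = quasipoisson)

# Pearson chi-squared statistic divided by the residual degrees of freedom
phi_pearson <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)

# Dispersion parameter reported by the quasi-Poisson summary
phi_quasi <- summary(quasi_fit)$dispersion

# Mean of the auxiliary variable plus one: an estimate on the same scale
# (close to, but not identical with, the Pearson-based value)
mu         <- fitted(pois_fit)
c_star_hat <- mean(((y - mu)^2 - y) / mu) + 1

c(pearson = phi_pearson, quasi = phi_quasi, c_star = c_star_hat)
```

The first two numbers agree, since the quasi-Poisson summary estimates its dispersion from the Pearson residuals; the third differs slightly because it divides by n rather than the residual degrees of freedom and subtracts the mean of y/μ instead of exactly one.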




