[R] Wilcoxon versus glm

Mon Feb 25 18:50:22 CET 2002

On Mon, 25 Feb 2002, Dominik Grathwohl wrote:

> Hi all,
> running the following code:
> > n <- 25
> > y0 <- rpois(n, 0.04)
> > y1 <- rpois(n, 0.34)
> >
> > resp <- c(y0, y1)
> > group <- c(rep(0,n), rep(1,n))
> >
> > wilcox.test(y0, y1)
>
>         Wilcoxon rank sum test with continuity correction
>
> data:  y0 and y1
> W = 250, p-value = 0.02074
> alternative hypothesis: true mu is not equal to 0
>
> Warning message:
> Cannot compute exact p-value with ties in:
> wilcox.test.default(y0, y1)
> >
> > glm.M1 <- glm(resp ~ group, family=poisson())
> > summary(glm.M1)
>
> Call:
> glm(formula = resp ~ group, family = poisson())
>
> Deviance Residuals:
>       Min         1Q     Median         3Q        Max
> -0.692820  -0.692820  -0.004968  -0.004968   2.227342
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -11.303     34.531  -0.327    0.743
> group          9.875     34.533   0.286    0.775
>
> (Dispersion parameter for poisson family taken to be 1)
>
>     Null deviance: 28.216  on 49  degrees of freedom
> Residual deviance: 19.899  on 48  degrees of freedom
> AIC: 34.512
>
> Number of Fisher Scoring iterations: 9
>
> I would interpret this to mean that the Wilcoxon test
> detects a group difference, while the glm does not.

Exactly

> I expected the beta for the group to be greater than zero.

And so it was: the estimate was nearly 10.

> Can somebody explain such a difference between two
> methods of testing a hypothesis? Where am I
> wrong?

Well, there are a number of issues here.

1/ There's no necessary reason why these two tests should agree, as they
have different null hypotheses.  The Wilcoxon test tests P(y1>y0)=1/2 (with
ties, P(y1>y0) + P(y1=y0)/2 = 1/2); the glm compares the two group means
on the log scale.
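
To make the two null hypotheses concrete, here is a minimal sketch in R. The
data are an assumption: they are not the original random draws, just one data
set (all zeros in y0; twenty zeros, four ones and one two in y1) that appears
to reproduce W = 250 from the quoted output. It shows the quantity the
Wilcoxon test looks at next to the two means the glm compares.

```r
# Hypothetical data chosen to match the quoted output (not the original draws)
n  <- 25
y0 <- rep(0, n)
y1 <- c(rep(0, 20), 1, 1, 1, 1, 2)

# exact = FALSE avoids the "cannot compute exact p-value with ties" warning
W <- unname(wilcox.test(y0, y1, exact = FALSE)$statistic)

# W counts the pairs with y0 > y1, plus half of the tied pairs, so
# W / (n * n) estimates P(y0 > y1) + P(y0 = y1) / 2; the null value is 1/2
p_hat <- W / (n * n)

# The Poisson glm with a single group indicator compares these two means
means <- c(y0 = mean(y0), y1 = mean(y1))

print(W)
print(p_hat)
print(means)
```

The Wilcoxon quantity is bounded between 0 and 1, whereas the ratio of the
means is unbounded, so the two tests can react very differently to the same
data.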

2/ The Wald tests done by glm() can perform badly when the difference
between the groups is very large, as it is here. A coefficient of 9.87 is
infinite for practical purposes (remember it corresponds to a ratio of
e^9.87, about 20000, in the means), so you were probably unlucky enough to
get all zeros in y0.  The MLE of the coefficient is then infinite, and the
standard error blows up along with the estimate, which is why the z value
is so small.
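
The behaviour of the Wald test can be sketched as follows. Again the data are
an assumption (a hypothetical data set with all zeros in y0, not the original
draws), and the exact numbers printed depend on the R version and on the
convergence settings in glm.control(), so none are shown here.

```r
# Hypothetical data: every observation in the first group is zero
y0 <- rep(0, 25)
y1 <- c(rep(0, 20), 1, 1, 1, 1, 2)
resp  <- c(y0, y1)
group <- rep(c(0, 1), each = 25)

fit  <- glm(resp ~ group, family = poisson())
wald <- coef(summary(fit))["group", ]  # Estimate, Std. Error, z value, Pr(>|z|)
print(wald)

# With a tighter tolerance the iterations run longer and the estimate drifts
# further out: there is no finite maximum for it to settle on
fit2 <- glm(resp ~ group, family = poisson(),
            control = glm.control(epsilon = 1e-12, maxit = 100))
drift <- unname(coef(fit2)["group"] - coef(fit)["group"])
print(drift)
```

The reported estimate is simply wherever the iterations happened to stop, and
the standard error is so large that the z statistic is close to zero however
different the groups are.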

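If you used
	anova(glm.M1, test="Chisq")
you would get a likelihood ratio test, which would behave better.  In your
output the deviance drops from 28.216 to 19.899, i.e. by 8.317 on 1 degree
of freedom, well beyond the 5% critical value of 3.84.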
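
A minimal sketch of the likelihood ratio test, run on a hypothetical data set
that appears to reproduce the deviances in the quoted output (an assumption;
the original draws are not available). The test is computed twice: once via
anova() with an explicit test argument, and once by hand from the two
deviances.

```r
# Hypothetical data chosen to match the quoted deviances
y0 <- rep(0, 25)
y1 <- c(rep(0, 20), 1, 1, 1, 1, 2)
resp  <- c(y0, y1)
group <- rep(c(0, 1), each = 25)

fit <- glm(resp ~ group, family = poisson())

# Analysis of deviance table with a chi-squared (likelihood ratio) test
print(anova(fit, test = "Chisq"))

# The same test by hand: drop in deviance referred to a chi-squared distribution
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p.value = lr_p))
```

Unlike the Wald statistic, the deviance difference stays well defined when
the estimate runs off to infinity, because the fitted likelihood itself
converges.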

	-thomas
