[R] Multilevel modeling with count variables

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Sat Mar 27 01:15:56 CET 2010


Your request might find better answers on the R-SIG-mixed-models
list ...

Anyway, some quick thoughts:

On Friday, 26 March 2010 at 15:20 -0800, dadrivr wrote:
> By the way, my concern with lmer and glmer is that they don't produce
> p-values, 

The argument made by D. Bates is convincing ... A large thread about
these issues exists in this (R-help) list's archive.

I won't get into the (real) debate about the (dubious) value of hypothesis
testing, and I'm aware that there are domains where journal editors
*require* p-values. Sigh... This insistence seems to be weakening,
however.

>            and the techniques used to approximate the p-values with those
> functions (pvals.fnc, HPDinterval, mcmcsamp, etc.) only apply to Gaussian
> distributions.

(Probably?) true for pvals.fnc, false for HPDinterval and mcmcsamp. You
can always generate a "sufficient" number of estimates and assess a
confidence interval and a p-value from such samples. See any good text
on the use of simulation in statistics...
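
For instance (a bare-bones sketch; beta.samp stands in for a vector of
MCMC or simulation draws of one fixed-effect coefficient, here faked
with rnorm() just to keep the example self-contained):

beta.samp <- rnorm(5000, mean = 0.3, sd = 0.1)       # stand-in for real draws

quantile(beta.samp, c(0.025, 0.975))                 # equal-tail 95% interval
library(coda)
HPDinterval(as.mcmc(beta.samp))                      # highest-density interval
2 * min(mean(beta.samp <= 0), mean(beta.samp >= 0))  # empirical two-sided "p"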

The fly in the ointment is that current versions of lmer seem to have
problems with mcmcsamp(). The bootstrap comes to mind, but it might be
non-trivial to do efficiently, especially in a hierarchical/multilevel
context.
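
For what it's worth, a bare-bones parametric bootstrap could look like
this (a sketch only: dat, y, x and group are made-up names, and it
assumes an lme4 version whose simulate() and refit() methods work for
the fitted object):

library(lme4)
fm <- glmer(y ~ x + (1 | group), data = dat, family = poisson)

nboot <- 999
boot.beta <- replicate(nboot, {
  ysim <- simulate(fm)[[1]]        # draw a new response from the fitted model
  fixef(refit(fm, ysim))["x"]      # refit and keep the coefficient of interest
})
quantile(boot.beta, c(0.025, 0.975))  # naive percentile interval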

> Given that I would likely be working with quasi-poisson
> distributions, is there a good alternative mixed effects modeling approach

glmer, or Bayesian modeling through BUGS (whose "probabilities" have a
(quite) different interpretation from p-values).

> that would output significance values?

No. You'd have to compute them yourself ... if you're able to derive the
(possibly asymptotic) distribution of your test statistic under the
hypothesis you aim to refute. This nut might be harder to crack than it
appears...
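
One conventional do-it-yourself route (a sketch, with made-up model and
variable names) is to compare nested fits with a likelihood-ratio test
and rely on the asymptotic chi-square distribution of the statistic:

library(lme4)
fit0 <- glmer(y ~ 1 + (1 | group), data = dat, family = poisson)
fit1 <- glmer(y ~ x + (1 | group), data = dat, family = poisson)
anova(fit0, fit1)   # LR test of the x term; the p-value is only asymptotic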

Now, to come back to your original question, you might try 1) to use
glm/glmer to model your dependent variable as a (quasi-)Poisson
variable, and 2) to use log transformations of the independent count
variables; a more general solution is given by the Box-Cox
transformation family: see boxcox() in MASS, which also offers
logtrans(). ISTR that John Fox's car package offers a function aiming at
finding the optimal simultaneous transformations of a dependent variable
and its predictors. Other packages I'm not aware of might offer other
solutions; in particular, I'd recommend peeking at the documentation of
Frank Harrell's Hmisc and rms packages, whose wealth I have not yet
seriously assessed...
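
To make points 1) and 2) concrete (a sketch only, with made-up data and
variable names):

library(lme4)
library(MASS)

## Poisson mixed model with a log-transformed count predictor;
## log1p() handles zero counts.
fit <- glmer(count.y ~ log1p(count.x) + (1 | subject),
             data = dat, family = poisson)

## Box-Cox profile for an ordinary regression (boxcox() wants a strictly
## positive response, hence the +1); lambda near 0 supports a log transform.
boxcox(lm(count.y + 1 ~ count.x, data = dat))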

Bayesian modeling with BUGS would also allow you to try to fit any model
you might wish to test (provided that it can be written as a directed
acyclic graph, and that the distributions of your variables are either
from a "standard" family available in BUGS or that you are able to
express the (log-)density of the non-standard distribution you wish to
use). But, again, no p-values in sight. Would you settle for Bayes
factors between two models? Or for DIC comparisons?
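
For illustration (a sketch only, with made-up names), a Poisson
random-intercept model written in BUGS syntax, stored as an R string so
that it could be handed to WinBUGS or JAGS through an interface package
such as R2WinBUGS or rjags:

model.string <- "
model {
  for (i in 1:N) {
    y[i] ~ dpois(mu[i])
    log(mu[i]) <- beta0 + beta1 * x[i] + u[group[i]]
  }
  for (j in 1:J) {
    u[j] ~ dnorm(0, tau.u)       # random intercepts (BUGS uses precisions)
  }
  beta0 ~ dnorm(0, 1.0E-4)       # vague priors on the fixed effects
  beta1 ~ dnorm(0, 1.0E-4)
  tau.u ~ dgamma(0.001, 0.001)   # vague prior on the random-effect precision
}"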

HTH,

					Emmanuel Charpentier


