[R] glm: modelling zeros as binary and non-zeroes as coming from a continuous distribution

Wed Mar 30 11:41:53 CEST 2011

Hello,

I'd like to implement a regression model for extremely zero-inflated
continuous data using a conditional (two-part) approach, whereby the
zero/non-zero indicator is modelled as a binary outcome, while the
non-zero values are modelled as log-normal.

So far, I've come across two solutions for this: one, in R, is
described in the book by Gelman & Hill
(http://www.amazon.com/dp/052168689X), where they just model zeros and
non-zeros separately and then bring them together by simulation. I can
do this, but it makes it difficult to assess the significance of
regression coefficients with respect to zero and to each other.
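
For concreteness, here is a minimal sketch of what I mean by the
separate-fits-plus-simulation approach. It is my own illustration, not
the code from the book: the data are simulated, and every name in it
(dat, fit_bin, fit_pos, newd, sims) is made up. It also plugs in the
point estimates when simulating, so it ignores uncertainty in the
coefficients, which as far as I understand the book's version accounts
for.

```r
# Simulated zero-inflated data (hypothetical, for illustration only)
set.seed(1)
n <- 500
x <- rnorm(n)
nonzero <- rbinom(n, 1, plogis(-1 + 0.8 * x))
y <- ifelse(nonzero == 1, exp(1 + 0.5 * x + rnorm(n, sd = 0.7)), 0)
dat <- data.frame(y = y, x = x)

# Part 1: logistic regression for P(y > 0)
fit_bin <- glm(I(y > 0) ~ x, family = binomial, data = dat)

# Part 2: linear model for log(y), fitted to the non-zero rows only
fit_pos <- lm(log(y) ~ x, data = dat[dat$y > 0, ])

# Combine the two parts by simulation at a few new x values,
# using the fitted point estimates (no parameter uncertainty)
newd <- data.frame(x = c(-1, 0, 1))
n_sims <- 1000
p_hat <- predict(fit_bin, newd, type = "response")
mu_hat <- predict(fit_pos, newd)
sigma_hat <- summary(fit_pos)$sigma
sims <- sapply(seq_along(p_hat), function(i) {
  rbinom(n_sims, 1, p_hat[i]) * exp(rnorm(n_sims, mu_hat[i], sigma_hat))
})

# Simulated mean of y at each new x
colMeans(sims)
```

summary(fit_bin) and summary(fit_pos) give the usual tests within each
part, but not across the two parts.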

Another solution I have been pointed to is in SAS:
http://listserv.uga.edu/cgi-bin/wa?A2=ind0805A&L=sas-l&P=R20779,
where they use NLMIXED (with only fixed effects) to specify their own
log-likelihood function.
I'm wondering if there's any way to do the same in R (lme can't deal
with this, as far as I'm aware).
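
To make the question concrete, this is roughly what I imagine the R
equivalent would look like: a hand-written log-likelihood maximised
with optim(). It is only a sketch under my own assumptions, not a
translation of the SAS code: the data are simulated, the function name
negll and the parameter ordering are made up, and the standard errors
are Wald-type, taken from the numerically differentiated Hessian.

```r
# Simulated zero-inflated data (hypothetical, for illustration only)
set.seed(1)
n <- 500
x <- rnorm(n)
nonzero <- rbinom(n, 1, plogis(-1 + 0.8 * x))
y <- ifelse(nonzero == 1, exp(1 + 0.5 * x + rnorm(n, sd = 0.7)), 0)

# Negative joint log-likelihood.
# par = (a0, a1, b0, b1, log_sigma):
#   a0, a1    logistic part, logit P(y > 0) = a0 + a1 * x
#   b0, b1    log-normal part, E[log y | y > 0] = b0 + b1 * x
#   log_sigma log of the sd of log(y), so that sigma stays positive
negll <- function(par, y, x) {
  eta <- par[1] + par[2] * x
  mu <- par[3] + par[4] * x
  sigma <- exp(par[5])
  pos <- y > 0
  # zeros contribute log(1 - p)
  ll_zero <- sum(plogis(eta[!pos], lower.tail = FALSE, log.p = TRUE))
  # non-zeros contribute log(p) + log-normal log-density
  ll_pos <- sum(plogis(eta[pos], log.p = TRUE) +
                dlnorm(y[pos], meanlog = mu[pos], sdlog = sigma, log = TRUE))
  -(ll_zero + ll_pos)
}

fit <- optim(c(0, 0, 0, 0, 0), negll, y = y, x = x,
             method = "BFGS", hessian = TRUE)

# Wald-type standard errors from the inverse Hessian
est <- fit$par
se <- sqrt(diag(solve(fit$hessian)))
cbind(estimate = est, se = se, z = est / se)
```

Since the two parts share no parameters here, the likelihood
factorises and the estimates should agree with fitting the two models
separately. The point of the joint fit would be to get a single
covariance matrix, or to allow parameters to be shared between the
parts.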

Finally, I'm wondering whether anyone has experience with the COZIGAM
package - does it do something like this?

Many thanks,
Mikhail