[R] Testing continuous zero-inflated response

Achim Zeileis Achim.Zeileis at uibk.ac.at
Sun Jan 27 11:10:01 CET 2013


On Sun, 27 Jan 2013, Kay Cichini wrote:

> Thanks for the reply!
>
> Still, aren't there issues with a two-sample test of y when there are excess
> zeroes (and thus many ties), as with the Mann-Whitney U test?

If you use the exact conditional distribution, or a Monte Carlo 
approximation of it, ties are no problem.

The only problem with the Wilcoxon/Mann-Whitney test and ties is that the 
simple recursion formula for computing the exact distribution works only 
without ties. Thus, it is not the exact distribution that is wrong, just 
the standard algorithm for evaluating it.
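
As a rough sketch of the point above, the permutation distribution of the 
rank-sum statistic can be approximated by simply reshuffling the group 
labels, and ties are handled by midranks. The data below are made up for 
illustration (36 observations, two balanced groups, six non-zero values), 
and the names rank_sum, perm and p_value are just local helpers. Only base 
R is used here; something like wilcox_test(y ~ group, distribution = 
"approximate") in coin should give a comparable result, but I have not run 
that on these data.

```r
# Hypothetical data: N = 36, two balanced groups, mostly zeros.
set.seed(1)
group <- factor(rep(c("A", "B"), each = 18))
y <- numeric(36)
y[c(3, 7, 11, 15)] <- c(0.8, 1.5, 2.1, 0.4)  # non-zero values in group A
y[c(22, 30)] <- c(0.3, 1.1)                  # non-zero values in group B

# Midranks: all tied observations (here the zeros) share their average rank.
r <- rank(y)

# Rank-sum statistic: sum of the ranks in the first group.
rank_sum <- function(ranks, g) sum(ranks[g == levels(g)[1]])
observed <- rank_sum(r, group)

# Monte Carlo approximation of the exact conditional distribution:
# recompute the statistic for B random relabelings of the observations.
B <- 9999
perm <- replicate(B, rank_sum(r, sample(group)))

# Two-sided p-value, measured as distance from the null expectation.
center <- mean(r) * sum(group == levels(group)[1])
p_value <- (1 + sum(abs(perm - center) >= abs(observed - center))) / (B + 1)
p_value
```

Nothing in this relies on the absence of ties. The reference distribution 
is generated from the observed (tied) ranks themselves.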

Best,
Z

> Kind regards,
> Kay
>
>
> 2013/1/26 Achim Zeileis <Achim.Zeileis at uibk.ac.at>
>
>> On Fri, 25 Jan 2013, Kay Cichini wrote:
>>
>>> Hello,
>>>
>>> I'm searching for a test that applies to a dataset (N=36) with a
>>> continuous zero-inflated dependent variable
>>>
>>
>> In a regression setup, one can use a regression model with a response
>> censored at zero. survreg() in survival fits such models; tobit() in AER is
>> a convenience interface for this special case.
>>
>> If the effects of a regressor can differ between the probability of a
>> zero and the mean of the non-zero observations, then a two-part model can
>> be used, e.g., a probit fit (via glm()) plus a truncated regression (via
>> truncreg() in the package of the same name).
>>
>> However:
>>
>>
>>> and only one nominal grouping variable with 2 levels (balanced).
>>>
>>
>> In that case I would probably use no regression model but two-sample
>> permutation tests, e.g. via the "coin" package.
>>
>>
>>> In fact there are 4 response variables of this kind, which I plan to test
>>> separately. The proportion of zeroes ranges from 75 to 97%.
>>>
>>
>> That means you have between one (!) and nine non-zero observations. In the
>> former case, it will be hard to model anything. And even in the latter case
>> it will be hard to investigate the probability of zero and the mean of the
>> non-zero observations separately.
>>
>> I would start out with a simple two-way table of (y > 0) vs group and
>> conduct Fisher's exact test.
>>
>> And then you might also try your favorite two-sample test of y vs group,
>> preferably using the approximate exact distribution.
>>
>> Hope that helps,
>> Z
>>
>>> I searched the web and found several modelling approaches but have the
>>> feeling that they are overly complex for my very simple dataset.
>>>
>>> Thanks in advance for any help!
>>> Kay
>>>
>>> --
>>>
>>> Kay Cichini, MSc Biol
>>>
>>> Grubenweg 22, 6071 Aldrans
>>>
>>> Tel.: 0650 9359101
>>>
>>> E-Mail: kay.cichini at gmail.com
>>>
>>> Web: www.theBioBucket.blogspot.co.at <http://www.theBioBucket.blogspot.co.at>
>>>
>>> --
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>