[R] lmer and a response that is a proportion

Cameron Gillies cgillies at ualberta.ca
Mon Dec 4 05:38:31 CET 2006


Hello Simon and John,

I'm afraid I need to include random effects (a random intercept and
possibly random coefficients), and it doesn't look like betareg can do that.

John, the data is spread along the range of 0 to 1 with most values closer
to 1, so it does transform well using the logit transformation.  I was
trying to avoid that though because I was not sure what impact the
transformation would have on the random effects or interpretation of the
coefficients.  
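
To make the interpretation question concrete, here is a minimal sketch of
what a logit transformation does to a proportion and to a coefficient. It
is plain arithmetic using only Python's standard library, not an lmer fit;
the helper names (logit, inv_logit, shift) and the slope of 0.5 are
hypothetical and chosen only for illustration.

```python
import math

def logit(p):
    """Log-odds of a proportion p; defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined at exactly 0 or 1")
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a value on the logit scale back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical slope on the logit scale (illustrative, not from the data).
slope = 0.5

# On the odds scale the effect is constant: odds are multiplied by exp(slope).
odds_ratio = math.exp(slope)

def shift(p, b=slope):
    """Change in the proportion when its logit is increased by b."""
    return inv_logit(logit(p) + b) - p

# On the proportion scale the same slope gives different changes,
# depending on where the starting proportion sits.
for p in (0.5, 0.9, 0.99):
    print(f"p = {p:.2f}: change in proportion = {shift(p):+.4f}")
```

Under these assumptions, a coefficient (or a random intercept) estimated on
the logit scale is a shift in log-odds, so its effect on the proportion
itself shrinks as values approach 1, which is where most of these data lie.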

Thanks again!
Cam

On 12/3/06 7:46 PM, "Simon Blomberg" <blomsp at ozemail.com.au> wrote:

> Would beta regression solve your problem? (package betareg)
> 
> Simon.
> 
> John Fox wrote:
>> Dear Cameron,
>> 
>> Given your description, I thought that this might be the case.
>> 
>> I'd first examine the distribution of the response variable to see what it
>> looks like. If the values don't push the boundaries of 0 and 1, and their
>> distribution is unimodal and reasonably symmetric, I'd consider analyzing
>> them directly using normally distributed errors. If the values do stack up
>> near 0, 1, or both, I'd consider a transformation, or perhaps a different
>> family (depending on the pattern); in particular, if they stack up near both
>> 0 and 1, a logit or similar transformation could help. Finally, if you have
>> many values of 0, 1, or both, then a transformation isn't promising (and,
>> indeed, the logit wouldn't be defined for these values). In any event, I'd
>> check diagnostics after a preliminary fit.
>> 
>> I hope this helps,
>>  John
>> 
>> --------------------------------
>> John Fox
>> Department of Sociology
>> McMaster University
>> Hamilton, Ontario
>> Canada L8S 4M4
>> 905-525-9140x23604
>> http://socserv.mcmaster.ca/jfox
>> --------------------------------
>> 
>>   
>>> -----Original Message-----
>>> From: Cameron Gillies [mailto:cgillies at ualberta.ca]
>>> Sent: Sunday, December 03, 2006 6:31 PM
>>> To: Prof Brian Ripley; John Fox
>>> Cc: r-help at stat.math.ethz.ch
>>> Subject: Re: [R] lmer and a response that is a proportion
>>> 
>>> Dear Brian and John,
>>> 
>>> Thanks for your insight.  I'll clarify a couple of things
>>> in case it changes your advice.
>>> 
>>> My response is a ratio of two measures taken during a bird's
>>> path, which varies from 0 to 1, so I cannot convert it to
>>> columns of the number of successes.  It has to be reported as
>>> the proportion.  I could logit transform it to make it
>>> normal, but I am trying to avoid that so I can analyze it directly.
>>> 
>>> The subjects are individual birds and I have a range of
>>> sample sizes from each bird (from 8 to >200, average of about
>>> 75 measurements/bird).
>>> 
>>> Thanks!
>>> Cam
>>> 
>>> 
>>> On 12/3/06 3:47 PM, "Prof Brian Ripley" <ripley at stats.ox.ac.uk> wrote:
>>> 
>>>     
>>>> On Sun, 3 Dec 2006, John Fox wrote:
>>>> 
>>>>       
>>>>> Dear Cameron,
>>>>> 
>>>>>         
>>>>>> -----Original Message-----
>>>>>> From: r-help-bounces at stat.math.ethz.ch
>>>>>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Cameron
>>>>>> Gillies
>>>>>> Sent: Sunday, December 03, 2006 1:58 PM
>>>>>> To: r-help at stat.math.ethz.ch
>>>>>> Subject: [R] lmer and a response that is a proportion
>>>>>> 
>>>>>> Greetings all,
>>>>>> 
>>>>>> I am using lmer (lme4 package) to analyze data where the response is
>>>>>> a proportion (0 to 1).  It appears to work, but I am wondering if
>>>>>> the analysis is treating the response appropriately - i.e. can lmer
>>>>>> do this?
>>>>>>
>>>>> As far as I know, you can specify the response as a proportion, in
>>>>> which case the binomial counts would be given via the weights
>>>>> argument -- at least that's how it's done in glm(). An alternative
>>>>> that should be equivalent is to specify a two-column matrix with
>>>>> counts of "successes" and "failures" as the response. Simply giving
>>>>> the proportion of successes without the counts wouldn't be
>>>>> appropriate.
>>>>>
>>>>>> I have used both family=binomial and quasibinomial - is one more
>>>>>> appropriate when the response is a proportion?  The coefficient
>>>>>> estimates are identical, but the standard errors are larger with
>>>>>> family=binomial.
>>>>>> 
>>>>>>           
>>>>> The difference is that in the binomial family the dispersion is fixed
>>>>> to 1, while in the quasibinomial family it is estimated as a free
>>>>> parameter. If the standard errors are larger with family=binomial,
>>>>> then that suggests that the data are underdispersed (relative to the
>>>>> binomial); if the difference is substantial -- the factor is just the
>>>>> square root of the estimated dispersion -- then the binomial model is
>>>>> probably not appropriate for the data.
>>>>>
>>>> John's last deduction is appropriate to a GLM, but not necessarily to
>>>> a GLMM. I don't have detailed experience with lmer for binomial, but I
>>>> do for various other fitting routines for GLMM.  Remember there are at
>>>> least two sources of randomness in a GLMM, and let us keep it simple
>>>> and have just a subject effect and a measurement error.  Then if
>>>> over-dispersion is happening within subjects, forcing the binomial
>>>> dispersion (at the measurement level) to 1 tends to increase the
>>>> estimate of the subject-level variance component to compensate, and in
>>>> turn increase some of the standard errors.
>>>> 
>>>> (Please note the 'tends' in that para, as the details of the design do
>>>> matter.  For cognoscenti, think about plot and sub-plot treatments in
>>>> a split-plot design.)
>> 
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>>   
>



