[R] transforming data glm

Ben Bolker bolker at ufl.edu
Mon Aug 24 17:58:41 CEST 2009




Mcdonald, Grant wrote:
> 
> Dear sir,
> 
> I am fitting a glm with default identity link:
> 
> 
> 
> model<-glm(timetoacceptsecs~maleage*maletub*relweight*malemobtrue*femmobtrue)
> 
> The model is overdispersed, and plot(model) shows a low level of linearity
> of the residuals.
> 
>   >> I don't see how the model can be *over*dispersed unless you are using
>   >> a family with a fixed scale parameter (binomial/Poisson/etc.)?
> 
> The overdispersion and the linearity of the residuals on the normal Q-Q
> plot are corrected well by using:
> 
> 
> 
> model<-glm(log(timetoacceptsecs)~maleage*maletub*relweight*malemobtrue*femmobtrue)
> 
> A Box-Cox analysis of my model also suggests that the log transformation
> is what I should use.
> 
> I would like to know how to achieve this by changing the link function or
> error family of my glm rather than directly taking the log of the response
> variable.
> 
> For instance:
> model<-glm(log(timetoacceptsecs)~maleage*maletub*relweight*malemobtrue*femmobtrue,
> family=poisson)
> does not improve my model in terms of overdispersion etc. as much as taking
> the log.
> 
> 

I don't see why you are using a Poisson family for data that are
apparently continuous (judging by the name, "time to accept in seconds"),
unless you have some particular reason to believe that in your system
they should follow a Poisson distribution. That seems unlikely; some form
of waiting-time distribution (exponential, gamma, Weibull) seems more
plausible.
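
As a rough illustration of the waiting-time suggestion, here is a minimal
sketch of a gamma GLM with a log link. The data are simulated and the names
x, y and fit_gamma are hypothetical stand-ins, not the original poster's
variables; whether a gamma family suits the real data would need checking.

```r
# Simulated data: 'x' is a hypothetical predictor, 'y' a positive
# waiting-time response (assumptions for illustration only).
set.seed(1)
n  <- 200
x  <- runif(n)
mu <- exp(1 + 2 * x)                        # mean on the original scale
y  <- rgamma(n, shape = 2, rate = 2 / mu)   # gamma response with mean mu

# Gamma family with a log link: the mean is exp(linear predictor) and
# the variance is proportional to the squared mean.
fit_gamma <- glm(y ~ x, family = Gamma(link = "log"))

coef(fit_gamma)                 # estimates on the log scale
summary(fit_gamma)$dispersion   # estimated dispersion parameter
```

Here the dispersion is estimated from the data rather than fixed, so the
question of overdispersion in the Poisson/binomial sense does not arise.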

  The difference between

  glm(y~x,family=gaussian(link="log"))

and

  glm(log(y)~x, family=gaussian(link="identity"))

(which is essentially equivalent to glm(log(y)~x) or lm(log(y)~x))

  is in whether the error is assumed to be normal with a constant
variance on the original scale (the first method) or on the
log-transformed scale (the second method).
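
The two specifications can be fitted side by side to see the distinction.
This is only a sketch on simulated data (x, y and the fit_* names are made
up for the example); the response is generated with multiplicative
log-normal error, so the second method happens to match the simulation.

```r
# Simulated data with normal error on the log scale (an assumption
# chosen for illustration).
set.seed(2)
n <- 200
x <- runif(n)
y <- exp(1 + 2 * x + rnorm(n, sd = 0.5))

# First method: normal error with constant variance on the original
# scale; the log link models log(E[y]) as linear in x.
fit_link <- glm(y ~ x, family = gaussian(link = "log"))

# Second method: normal error with constant variance on the log scale;
# this models E[log(y)] as linear in x.
fit_trans <- glm(log(y) ~ x, family = gaussian(link = "identity"))
fit_lm    <- lm(log(y) ~ x)

# The glm and lm versions of the second method give the same estimates.
all.equal(coef(fit_trans), coef(fit_lm))

# The first method generally gives different estimates, since it targets
# the mean of y rather than the mean of log(y).
rbind(link = coef(fit_link), transformed = coef(fit_trans))
```

Residual plots (for example plot(fit_link) and plot(fit_trans)) are one way
to judge which variance assumption is more reasonable for a given data set.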

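  Note that you have to be careful about model comparisons between
continuous data transformed to different scales: the likelihoods (and
hence AIC values) of the two fits refer to different response variables,
y and log(y), so they are not directly comparable as reported.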
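
One common way to put the two fits on the same footing is a change of
variables: the density of y equals the density of log(y) times 1/y, so the
log-likelihood of the log-scale model, expressed for y, is its reported
log-likelihood minus sum(log(y)). The sketch below applies that adjustment
to AIC on simulated data; all names are hypothetical.

```r
# Simulated data (assumption for illustration only).
set.seed(3)
n <- 200
x <- runif(n)
y <- exp(1 + 2 * x + rnorm(n, sd = 0.5))

fit_link  <- glm(y ~ x, family = gaussian(link = "log"))
fit_trans <- lm(log(y) ~ x)

# AIC(fit_trans) is based on the likelihood of log(y), not of y.
# Adjusting by the Jacobian term puts it on the scale of y:
#   logLik for y = logLik for log(y) - sum(log(y))
#   AIC for y    = AIC for log(y)    + 2 * sum(log(y))
aic_link  <- AIC(fit_link)
aic_trans <- AIC(fit_trans) + 2 * sum(log(y))

c(link = aic_link, transformed = aic_trans)
```

Without the adjustment, the raw AIC values differ simply because the
responses are measured on different scales.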

  Bottom line: I don't see what's wrong with your second model.  Why not
just use it?
