[R] function logit() vs logistic regression

Rolf Turner rolf.turner at xtra.co.nz
Wed Oct 17 23:12:28 CEST 2012

On 18/10/12 07:58, swertie wrote:
> Hello!
> When I am analyzing proportion data, I usually apply logistic regression
> using a glm model with binomial family. For example:
> m <- glm( cbind("not realized", "realized") ~ v1 + v2 , family="binomial")
> However, sometimes I don't have the number of cases (realized, not
> realized), but only the proportion and thus cannot compute the binomial
> model. I just found out that the package car contains a function "logit"
> which allows for logit transformation. Would it be possible to transform the
> proportion data with this function and analyze the transformed data with a
> glm with family="gaussian"?
> Thank you very much.

Of course it's possible, but I doubt me an it maketh a great deal of sense.

(1) You don't need the car package to get a logit() function.  You can roll
your own in a couple of lines.

(2) I believe that the conventional wisdom is that arcsin(sqrt(x)) should be
used to transform proportion data to something which vaguely resembles
Gaussian data.  This transformation has the effect of "stabilizing the
variance".  (Others on the list may correct me on this point.)

(3) Whatever you try is not going to work very well if you have proportion
values that are close to 0 or to 1.

(4) Whatever you try is going to be a pretty shaganappi approximation.
The fact is that the variance of proportions does vary with the number of
cases.  A variance stabilizing transformation mitigates this effect but does
not eliminate it.  See fortune(111).
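
To make points (1) to (4) concrete, here is a minimal sketch in base R.  The
names my_logit, asin_sqrt and sim_var, the true proportion 0.3 and the group
sizes 10 and 100 are all made up for illustration; none of it comes from the
original data or from the car package.

```r
# Hypothetical helper names, chosen only for this illustration.
my_logit  <- function(p) log(p / (1 - p))   # point (1): a roll-your-own logit
asin_sqrt <- function(p) asin(sqrt(p))      # point (2): arcsine square root

# Point (3): behaviour at and near the boundaries 0 and 1.
p <- c(0, 0.01, 0.5, 0.99, 1)
my_logit(p)    # infinite at exactly 0 and 1, large in magnitude nearby
asin_sqrt(p)   # stays finite, running from 0 to pi/2

# Point (4): simulate observed proportions for two assumed group sizes
# and compare the variance of the transformed values.
set.seed(1)
sim_var <- function(n, prob = 0.3, reps = 10000) {
  phat <- rbinom(reps, size = n, prob = prob) / n   # simulated proportions
  var(asin_sqrt(phat))                              # variance after transforming
}
v10  <- sim_var(10)
v100 <- sim_var(100)

# Simulated variances next to the large-sample approximation 1/(4n).
c(n10 = v10, n100 = v100, approx10 = 1 / (4 * 10), approx100 = 1 / (4 * 100))
```

Base R's qlogis() computes the same logit, so the one-liner is there only to
show how little is involved.  The simulated variances should land somewhere
near 1/(4n): roughly free of the underlying proportion, but still dependent
on the number of cases, which is the information that is missing when only
the proportions are available.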


         Rolf Turner
