# [R] A logit question?

Renaud Lancelot lancelot at sentoo.sn
Mon May 6 15:12:40 CEST 2002

```It might be possible to use frequency data, if you mean proportions.
There are two ways: either the response is a two-column matrix (successes,
failures) or a vector of proportions, successes / (successes + failures).
In the latter case, you will have to use the "weights" argument, with
the weights = the denominator of the proportion.

See help on glm:

?glm
[snip]
Details:

A typical predictor has the form `response ~ terms' where
`response' is the (numeric) response vector and `terms' is a
series of terms which specifies a linear predictor for `response'.
For `binomial' models the response can also be specified as a
`factor' (when the first level denotes failure and all others
success) or as a two-column matrix with the columns giving the
numbers of successes and failures.  A terms specification of the
form `first + second' indicates all the terms in `first' together
with all the terms in `second' with duplicates removed.
[snip]

Hope this helps,

Renaud

Achim Zeileis wrote:
>
> Mäkinen Jussi wrote:
> >
> > I have got a few answers which pointed out that the logit model is usually for
> > a binary response (dependent) variable. And this was part of my (obviously
> > badly written) question: is it possible to regress frequency data (i.e. a
> > non-binary response) with glm(y~x, family=binomial(link=logit))?
> >
> > glm-help says:
> >
> > <snip>...For `binomial' models the response can also be specified as a
> > `factor' (when the first level denotes failure and all others success) or as
> > a two-column matrix with the columns giving the numbers of successes and
> > failures....<snip>
> >
> > which led me to think that it can handle frequency data (grouped data) as well.
> >
> > But shouldn't that give the same result as transforming the response and running
> > regular OLS?
>
> No, glm() gives you the ML estimate for the regression coefficients.
> Only for the gaussian family are ML and OLS the same.
> Z
>
> > Jussi
> >
> > Mäkinen Jussi wrote:
> >
> > >Hello dear r-gurus!
> > >
> > >I have a question about the logit model. I think I have misunderstood
> > >something and I'm trying to find a bug in my code or, even better, in my
> > >head. Any help is appreciated.
> > >
> > >The question in short: why am I not getting the same coefficients from the
> > >logit regression when using a link function as when using an explicit
> > >transformation of the dependent variable? Details below.
> > >
> > >I'm not very familiar with the concept. As far as I have understood, it's all
> > >about a transformation of the dependent variable if one has frequency data
> > >(grouped data, instead of raw binaries):
> > >
> > >ln(^p(i)/(1-^p(i))) = c + b_1(X_1) +...+ b_k(X_k) + e(i),
> > >
> > >where ^p(i) is the (estimated) frequency of the incident (happened/all = n(i)/N), i
> > >is the index of the observation, c and b_. are the coefficients (the objects of the
> > >estimation), X_. are the explanatory variables and e is the residual. So a
> > >linear regression.
> > >
> > >And some testing:
> > >
> > >
> > >>y <- runif(100)
> > >>
> > Should you use a binomial (0,1) response variable?
> >
> > best regards!
> >
> > >>
> > >>X <- rnorm(100)
> > >>
> >
> > >>
> > >
> > >Call:  glm(formula = y ~ X, family = binomial(link = logit))
> > >
> > >Coefficients:
> > >(Intercept)            X
> > >   -0.00956      0.10760
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> > >Null Deviance:      43.83
> > >Residual Deviance: 43.49        AIC: 142.3
> > >Warning message:
> > >non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >
> > >
> > >### OR
> > >
> > >>glm(cbind(y, 1-y)~ X, family=binomial(link=logit))    ### ?glm
> > >>
> > >
> > >Call:  glm(formula = cbind(y, 1 - y) ~ X, family = binomial(link = logit))
> > >
> > >Coefficients:
> > >(Intercept)            X
> > >   -0.00956      0.10760
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> > >Null Deviance:      43.83
> > >Residual Deviance: 43.49        AIC: 142.3
> > >Warning message:
> > >non-integer counts in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >
> > >
> > >### BUT
> > >
> > >>glm(y.logit.transformation(y)~ X)
> > >>
> > >
> > >Call:  glm(formula = y.logit.transformation(y) ~ X)
> > >
> > >Coefficients:
> > >(Intercept)            X
> > >     0.1233       0.1023
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> > >Null Deviance:      465.6
> > >Residual Deviance: 464.4        AIC: 443.3
> > >
> > >
> > >### OR
> > >
> > >>lm(y.logit.transformation(y)~ X)
> > >>
> > >
> > >Call:
> > >lm(formula = y.logit.transformation(y) ~ X)
> > >
> > >Coefficients:
> > >(Intercept)            X
> > >     0.1233       0.1023
> > >
> > >
> > >It's close (AIC and residual deviance differ because of the transformation) but I
> > >think that the relationship should be exact? Or is it just numerical
> > >inaccuracy? Or is there some reason hidden from me? Is it legitimate to
> > >use frequency regression when using R for the logit model (alternatives?).
> > >
> > >I would like to know what exactly the warning message means:
> > >non-integer counts in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >For the transformation of the dependent variable:
> > >
> > >y.logit.transformation <- function(y)
> > >{
> > >       # empirical logit (log-odds) of a proportion y, defined for 0 < y < 1
> > >       log(y / (1 - y))
> > >}
> > >
> > >version
> > >
> > >platform i386-pc-mingw32
> > >arch     i386
> > >os       mingw32
> > >system   i386, mingw32
> > >status
> > >major    1
> > >minor    5.0
> > >year     2002
> > >month    04
> > >day      29
> > >language R
> > >
> > >OS is Windows2000.
> > >
> > >Thank you for any help.
> > >
> > >
> > >Jussi Mäkinen
> > >Analyst
> > >State Treasury, Finland
> > >phone:  +358-9-7725 616
> > >mobile: +358-50-5958 710
> > >www.statetreasury.fi
> > >mailto:jussi.makinen at valtiokonttori.fi
> > >
> > >
> > >
> > >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > >r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > >Send "info", "help", or "[un]subscribe"
> > >(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> > >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> > >
> >

--
Dr Renaud Lancelot, vétérinaire
Programme Productions Animales

ISRA-LNERV                      tel    (221) 832 49 02
BP 2057 Dakar-Hann              fax    (221) 821 18 79 (CIRAD)

```
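The two ways of passing grouped data to glm() mentioned in the reply can be sketched as follows. This is only an illustration on simulated data: the group size of 50, the "true" coefficients and all object names (succ, fail, size, prop, fit.matrix, fit.prop, fit.ols) are assumptions made for the example and do not come from the thread. Unlike y <- runif(100) in the original test, the response here is built from integer counts, so the "non-integer" warning should not appear.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
n.groups <- 100
X      <- rnorm(n.groups)                     # one covariate value per group
size   <- rep(50, n.groups)                   # assumed number of trials per group
p.true <- plogis(-0.5 + 1 * X)                # assumed "true" success probabilities
succ   <- rbinom(n.groups, size = size, prob = p.true)
fail   <- size - succ

# Way 1: two-column matrix of successes and failures
fit.matrix <- glm(cbind(succ, fail) ~ X, family = binomial(link = "logit"))

# Way 2: proportion as response, denominators supplied through `weights`
prop     <- succ / size
fit.prop <- glm(prop ~ X, family = binomial(link = "logit"), weights = size)

# Both calls specify the same binomial model, so the ML estimates agree
print(all.equal(coef(fit.matrix), coef(fit.prop)))

# OLS on the empirical logit is a different estimator; groups with
# prop equal to 0 or 1 have an infinite logit and are dropped here
ok      <- prop > 0 & prop < 1
fit.ols <- lm(qlogis(prop[ok]) ~ X[ok])

print(rbind(glm.matrix = coef(fit.matrix),
            glm.prop   = coef(fit.prop),
            ols.logit  = coef(fit.ols)))
```

The last table would typically show the two glm() rows identical and the OLS row close but not equal, which is the pattern reported in the original question: the ML fit weights each group by its binomial variance, whereas lm() on the transformed response treats all groups alike.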