[R] logistic regression weights problem

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Apr 13 18:42:25 CEST 2005


On Wed, 13 Apr 2005, Federico Calboli wrote:

> I have a problem with weighted logistic regression. I have a number of
> SNPs and a case/control scenario, but not all genotypes are as
> "guaranteed" as others, so I am using weights to downweight the
> importance of individuals whose genotype has been heavily "inferred".
>
> My data is quite big, but with a dummy example:
>
>> status <- c(1,1,1,0,0)
>> SNPs <- matrix( c(1,0,1,0,0,0,0,1,0,1,0,1,0,1,1), ncol =3)
>> weight <- c(0.2, 0.1, 1, 0.8, 0.7)
>> glm(status ~ SNPs, weights = weight, family = binomial)
>
> Call:  glm(formula = status ~ SNPs, family = binomial, weights = weight)
>
> Coefficients:
> (Intercept)        SNPs1        SNPs2        SNPs3
>     -2.079       42.282      -18.964           NA
>
> Degrees of Freedom: 4 Total (i.e. Null);  2 Residual
> Null Deviance:      3.867
> Residual Deviance: 0.6279       AIC: 6.236
> Warning messages:
> 1: non-integer #successes in a binomial glm! in: eval(expr, envir,
> enclos)
> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y
> = Y, weights = weights, start = start, etastart = etastart,
>
> NB I do not get warning (2) for my data so I'll completely disregard it.
>
> Warning (1) looks suspiciously like a multiplication of my C/C status by
> the weights... what exactly is glm doing with the weight vector?

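It is using it in the GLM definition.  If you specify a response 
0 <= y_i <= 1 and weights a_i, you are specifying a binomial observation 
with a_i trials and a_i*y_i successes, which is why non-integer a_i*y_i 
gives warning (1).  Look up any book on GLMs and see what it says about 
the binomial, e.g. MASS4 pp. 184, 190.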
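
A minimal sketch of this reading of 'weights', assuming hypothetical 
grouped data (x, n and succ are made up for illustration; status and 
weight are the dummy values from the question).  A proportion response 
with weights equal to the number of trials should give the same fit as 
the two-column cbind(successes, failures) response:

```r
# Dummy values from the question: the implied "number of successes"
# is status * weight, which is not an integer here.
status <- c(1, 1, 1, 0, 0)
weight <- c(0.2, 0.1, 1, 0.8, 0.7)
implied_successes <- status * weight
print(implied_successes)   # [1] 0.2 0.1 1.0 0.0 0.0

# Hypothetical grouped data: n trials and succ successes per group.
x    <- c(0, 1, 2, 3)
n    <- c(10, 20, 15, 25)
succ <- c(2, 9, 9, 20)
prop <- succ / n           # observed proportion of successes

# Proportion response, with weights giving the number of trials ...
fit_prop  <- glm(prop ~ x, weights = n, family = binomial)
# ... and the equivalent two-column (successes, failures) response.
fit_cbind <- glm(cbind(succ, n - succ) ~ x, family = binomial)

same_fit <- all.equal(coef(fit_prop), coef(fit_cbind))
print(same_fit)
```

With integer counts no warning is expected; the warning in the question 
arises because the weights there are not numbers of trials.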

> In any case, how would I go about weighting my individuals in a logistic
> regression?

Use the cbind(yes, no) form of specification.  Note though that the 
'weights' in a GLM are case weights, not arbitrary downweighting 
factors, and aspects of the output (e.g. AIC, anova) depend on this.  A 
different implementation of (differently) weighted GLMs is svyglm() in 
package 'survey'.
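
A rough illustration of why the scale of case weights matters, again 
assuming hypothetical grouped data (x, n and succ are made up).  Doubling 
every count leaves the observed proportions, and hence the coefficient 
estimates, unchanged, but the deviance, standard errors and AIC change 
because the fit now claims twice as many observations:

```r
# Hypothetical grouped data, made up for illustration.
x    <- c(0, 1, 2, 3)
n    <- c(10, 20, 15, 25)
succ <- c(2, 9, 9, 20)

fit1 <- glm(cbind(succ, n - succ) ~ x, family = binomial)
# Same proportions, but every case weight (number of trials) doubled.
fit2 <- glm(cbind(2 * succ, 2 * (n - succ)) ~ x, family = binomial)

# Coefficients agree up to convergence tolerance.
same_coef <- all.equal(coef(fit1), coef(fit2), tolerance = 1e-6)

# Deviance doubles and standard errors shrink by a factor of sqrt(2).
dev_ratio <- deviance(fit2) / deviance(fit1)
se_ratio  <- sqrt(diag(vcov(fit1))) / sqrt(diag(vcov(fit2)))

print(same_coef)
print(dev_ratio)
print(se_ratio)
print(c(AIC(fit1), AIC(fit2)))
```

So weights that are merely relative confidence scores cannot be rescaled 
freely without changing the inference, which is the sense in which they 
are not arbitrary downweighting factors.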

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



