[R] Weights in binomial glm

Fri Apr 16 16:31:45 CEST 2010

Jan,

You misread the documentation in ?glm. Note that glm works with different kinds of families, so the first statement about weights is rather general: it holds for most of the families. The documentation explicitly tells you that this is not the case for the binomial family: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes". Nothing more, nothing less.
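
A minimal sketch of what that sentence in ?glm means in practice. The data frame `trials_df`, its column names and its counts are made up for illustration. The response holds the proportion of successes and `weights` holds the number of trials behind each proportion:

```r
# Hypothetical aggregated data: one row per person.
trials_df <- data.frame(
  Person    = c("A", "B"),
  Trials    = c(20, 10),
  Successes = c(5, 8)
)
trials_df$Proportion <- trials_df$Successes / trials_df$Trials

# The response is a proportion, so the prior weights carry the number of trials.
m_prop <- glm(Proportion ~ Person, data = trials_df,
              family = binomial, weights = Trials)

# The prior weights stored in the fit are exactly the numbers of trials.
weights(m_prop, type = "prior")

# With one parameter per person the fitted probabilities reproduce
# the observed proportions.
fitted(m_prop)
```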

Scaling the weights will change the results because you change the NUMBER OF TRIALS. More trials = more information = lower variances. So you only need to give the weights when the response is expressed as a proportion. If you have it as a binary variable or as cbind(NumberOfSuccesses, NumberOfFailures), then you don't need weights.
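
The following sketch puts the three ways of writing the response side by side and then multiplies the weights by 10. It uses base R only; the seed, the object names and the factor of 10 are arbitrary choices for illustration, not part of the original example. In this intercept-only sketch the point estimate is the same for all four fits, while the standard error of the rescaled fit is smaller by a factor sqrt(10), which is the "more trials = lower variances" effect:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Hypothetical binary data: 20 trials for person A, 10 for person B.
dataset <- data.frame(
  Person  = rep(c("A", "B"), times = c(20, 10)),
  Success = c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75))
)

# Aggregate to one row per person.
agg <- data.frame(
  Person    = c("A", "B"),
  Successes = as.vector(tapply(dataset$Success, dataset$Person, sum)),
  Trials    = as.vector(tapply(dataset$Success, dataset$Person, length))
)
agg$Failures   <- agg$Trials - agg$Successes
agg$Proportion <- agg$Successes / agg$Trials

# 1. Binary response: no weights needed.
m_binary <- glm(Success ~ 1, data = dataset, family = binomial)
# 2. cbind(successes, failures): no weights needed.
m_cbind <- glm(cbind(Successes, Failures) ~ 1, data = agg, family = binomial)
# 3. Proportion response: the weights give the number of trials.
m_prop <- glm(Proportion ~ 1, data = agg, family = binomial,
              weights = Trials)
# 4. Same proportions, but claiming ten times as many trials per row.
m_scaled <- glm(Proportion ~ 1, data = agg, family = binomial,
                weights = 10 * Trials)

# Helper: standard errors of the coefficients.
se <- function(m) unname(sqrt(diag(vcov(m))))

data.frame(
  model    = c("binary", "cbind", "proportion", "proportion, weights x 10"),
  estimate = unname(c(coef(m_binary), coef(m_cbind),
                      coef(m_prop), coef(m_scaled))),
  std_err  = c(se(m_binary), se(m_cbind), se(m_prop), se(m_scaled))
)
```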

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx op inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

> -----Original message-----
> From: Jan van der Laan [mailto:djvanderlaan op gmail.com]
> Sent: Friday 16 April 2010 16:09
> To: ONKELINX, Thierry
> CC: r-help op r-project.org
> Subject: Re: [R] Weights in binomial glm
> 
> Thierry,
> 
> Thank you for your answer.
> 
> From the documentation it looks like it is valid to assume 
> that the weights can be used for replicate weights.
> Continuing your example:
> 
> dataset$Success2 <- dataset$Success
> Aggregated2 <- cast(Person + Success ~ ., data = dataset,
>                     value = "Success2", fun = list(mean, length))
> m2 <- glm(mean ~ 1, data = Aggregated2, family = binomial,
>           weights = length)
> 
> In this case the weights can be seen as replicate weights. In 
> my case the proportion of successes for each group is either 0 or 1.
> 
> I am familiar with the survey package. However, in this case
> there should not be a difference between the two as far as the
> parameter estimates are concerned (the standard errors are
> incorrect for glm).
> 
> The strange thing in this case is that the estimates seem to 
> depend on the scaling of the weights, which should not be the 
> case. Also in your example scaling the weights gives the same 
> estimate:
> 
> m1 <- glm(mean ~ 1, data = Aggregated, family = binomial, 
> weights = length/10)
> 
> Regards,
> Jan
> 
> 
> 
> On Fri, Apr 16, 2010 at 3:19 PM, ONKELINX, Thierry
> <Thierry.ONKELINX op inbo.be> wrote:
> > Jan,
> >
> > It looks like you did not understand the line "For a binomial GLM
> > prior weights are used to give the number of trials when the response
> > is the proportion of successes."
> >
> > Weights must be a number of trials (hence integer), not a proportion
> > of a population. Here is an example that clarifies the use of weights.
> >
> > library(boot)     # for inv.logit()
> > library(reshape)  # for cast()
> > dataset <- data.frame(
> >   Person = c(rep("A", 20), rep("B", 10)),
> >   Success = c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75)))
> > Aggregated <- cast(Person ~ ., data = dataset, value = "Success",
> >                    fun = list(mean, length))
> >
> > m0 <- glm(Success ~ 1, data = dataset, family = binomial)
> > m1 <- glm(mean ~ 1, data = Aggregated, family = binomial,
> >           weights = length)
> >
> > inv.logit(coef(m0))
> > inv.logit(coef(m1))
> >
> > Have a look at the survey package if you want to analyse stratified
> > data.
> >
> > Thierry
> >
> >> -----Original message-----
> >> From: r-help-bounces op r-project.org
> >> [mailto:r-help-bounces op r-project.org] On behalf of Jan van der Laan
> >> Sent: Friday 16 April 2010 14:11
> >> To: r-help op r-project.org
> >> Subject: [R] Weights in binomial glm
> >>
> >> I have some questions about the use of weights in binomial glm as I
> >> am not getting the results I would expect. In my case the weights I
> >> have can be seen as 'replicate weights'; one respondent i in my
> >> dataset corresponds to w[i] persons in the population. From the
> >> documentation of the glm method, I understand that the weights can
> >> indeed be used for this: "For a binomial GLM prior weights are used
> >> to give the number of trials when the response is the proportion of
> >> successes."
> >> From "Modern applied statistics with S-Plus 3rd ed." I understand
> >> the same.
> >>

Please do not print this message unnecessarily.

The views expressed in this message and any annex are purely those of the
writer and may not be regarded as stating an official position of INBO, as
long as the message is not confirmed by a duly signed document.