[R] How to obtain individual log-likelihood value from glm?

John Smith jswhct at gmail.com
Sat Aug 29 19:23:42 CEST 2020


In the book Modern Applied Statistics with S (4th edition, 2002) by
Venables and Ripley, there is a function logitreg on page 445 which,
judging by its loss function, implements the weighted logistic regression
I asked about. Interestingly, logitreg produces the same coefficients as
glm in the example I provided earlier, even with weights < 1, and it also
yields the same residual deviance as glm. Unless I have misunderstood
something, I am convinced that glm is a valid tool for weighted logistic
regression, despite the description of weights and the somewhat
questionable logLik value in the case of non-integer weights < 1. Perhaps
this is a bold claim, but the description of weights could be amended and
logLik updated accordingly.
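
To make the comparison concrete, here is a minimal sketch (my own code,
not the book's logitreg verbatim) that minimizes the weighted negative
log-likelihood directly with optim and compares the result with glm on
the same made-up data and weights I used earlier; glm will warn about the
non-integer weights, which is exactly the case in question:

set.seed(135)
y <- c(rep(0, 50), rep(1, 50))
x <- rnorm(100)
w <- c(rep(1, 50), rep(2, 50)) / 10          # non-integer weights < 1

negll <- function(beta, X, y, w) {
  eta <- drop(X %*% beta)
  -sum(w * (y * eta - log1p(exp(eta))))      # weighted Bernoulli log-likelihood, sign flipped
}

X <- cbind(1, x)
opt <- optim(c(0, 0), negll, X = X, y = y, w = w, method = "BFGS")
fit <- glm(y ~ x, family = binomial(), weights = w)

rbind(optim = opt$par, glm = coef(fit))        # coefficients should agree
c(optim = 2 * opt$value, glm = deviance(fit))  # residual deviances should agree too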

The Stack Exchange inquiry I provided is what I find interesting, not the
link in that post. Sorry for the confusion.

On Sat, Aug 29, 2020 at 10:18 AM John Smith <jswhct using gmail.com> wrote:

> Thanks for the very insightful thoughts. What I am trying to achieve with
> the weights is actually not new; see, for example,
> https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
> I thought my inquiry was not too strange and that I could make use of some
> existing code. It is just an optimization problem at the end of the day, is
> it not? Thanks
>
> On Sat, Aug 29, 2020 at 9:02 AM John Fox <jfox using mcmaster.ca> wrote:
>
>> Dear John,
>>
>> On 2020-08-29 1:30 a.m., John Smith wrote:
>> > Thanks Prof. Fox.
>> >
>> > I am curious: what is the model estimated below?
>>
>> Nonsense, as Peter explained in a subsequent response to your prior
>> posting.
>>
>> >
>> > I guess my inquiry is more complicated than I thought: with y being 0/1,
>> > how does one fit a weighted logistic regression with weights < 1, in the
>> > sense of weighted least squares? Thanks
>>
>> What sense would that make? WLS is meant to account for non-constant
>> error variance in a linear model, but in a binomial GLM, the variance is
>> purely a function of the mean.
>>
>> If you had binomial (rather than binary 0/1) observations (i.e.,
>> binomial trials exceeding 1), then you could account for overdispersion,
>> e.g., by introducing a dispersion parameter via the quasibinomial
>> family, but that isn't equivalent to variance weights in a LM; rather, it
>> is analogous to the error-variance parameter in a LM.
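>>
>> For instance (a minimal sketch with made-up grouped data; the variable
>> names are just illustrative), the quasibinomial family estimates a
>> dispersion parameter from the data rather than fixing it at 1:
>>
>> trials  <- rep(20, 8)                      # hypothetical group sizes
>> success <- c(5, 9, 4, 14, 8, 11, 7, 16)    # hypothetical success counts
>> xg      <- 1:8
>> fq <- glm(cbind(success, trials - success) ~ xg, family = quasibinomial())
>> summary(fq)$dispersion   # estimated dispersion; near 1 means little overdispersion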
>>
>> I guess the question is: what are you trying to achieve with the weights?
>>
>> Best,
>>   John
>>
>> >
>> >> On Aug 28, 2020, at 10:51 PM, John Fox <jfox using mcmaster.ca> wrote:
>> >>
>> >> Dear John
>> >>
>> >> I think that you misunderstand the use of the weights argument to
>> >> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights
>> >> are used to give the number of trials when the response is the
>> >> proportion of successes." That is, in this case y should be the
>> >> observed proportion of successes (i.e., between 0 and 1) and the
>> >> weights are integers giving the number of trials for each binomial
>> >> observation.
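>> >>
>> >> For example (a small sketch with made-up numbers), the two equivalent
>> >> ways of specifying grouped binomial data give the same fit:
>> >>
>> >> trials  <- c(10, 20, 15)   # hypothetical numbers of trials
>> >> success <- c(3, 12, 9)     # hypothetical numbers of successes
>> >> z       <- c(1, 2, 3)
>> >> f1 <- glm(success/trials ~ z, family = binomial(), weights = trials)
>> >> f2 <- glm(cbind(success, trials - success) ~ z, family = binomial())
>> >> all.equal(coef(f1), coef(f2))   # same coefficients either way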
>> >>
>> >> I hope this helps,
>> >> John
>> >>
>> >> John Fox, Professor Emeritus
>> >> McMaster University
>> >> Hamilton, Ontario, Canada
>> >> web: https://socialsciences.mcmaster.ca/jfox/
>> >>
>> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> >>> If the weights are < 1, then we get different values! See the example
>> >>> below. How should I interpret the logLik value then?
>> >>> set.seed(135)
>> >>> y <- c(rep(0, 50), rep(1, 50))
>> >>> x <- rnorm(100)
>> >>> data <- data.frame(x, y)
>> >>> weights <- c(rep(1, 50), rep(2, 50))
>> >>> fit <- glm(y ~ x, data = data, family = binomial(), weights = weights/10)
>> >>> res.dev <- residuals(fit, type = "deviance")
>> >>> res2 <- -0.5 * res.dev^2
>> >>> cat("log-likelihood value", logLik(fit), sum(res2), "\n")
>> >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard <pdalgd using gmail.com> wrote:
>> >>>> If you don't worry too much about an additive constant, then half the
>> >>>> negative squared deviance residuals should do. (Not quite sure how
>> >>>> weights factor in. Looks like they are accounted for.)
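>> >>>>
>> >>>> For the Poisson example quoted below, the additive constant is the
>> >>>> saturated-model term, so something along these lines should recover
>> >>>> the individual contributions (a sketch, not checked carefully):
>> >>>>
>> >>>> ll_i <- -0.5 * residuals(glm.D93, type = "deviance")^2 +
>> >>>>   dpois(counts, counts, log = TRUE)               # per-observation terms
>> >>>> all.equal(sum(ll_i), as.numeric(logLik(glm.D93))) # they sum to logLik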
>> >>>>
>> >>>> -pd
>> >>>>
>> >>>>> On 25 Aug 2020, at 17:33, John Smith <jswhct using gmail.com> wrote:
>> >>>>>
>> >>>>> Dear R-help,
>> >>>>>
>> >>>>> The function logLik can be used to obtain the maximum log-likelihood
>> >>>>> value from a glm object. This is an aggregated value, a summation of
>> >>>>> individual log-likelihood values. How do I obtain the individual
>> >>>>> values? In the following example, I would expect 9 numbers, since the
>> >>>>> response has length 9. I could write a function to compute the
>> >>>>> values, but glm supports many families, and I am trying not to
>> >>>>> reinvent the wheel. Thanks!
>> >>>>>
>> >>>>> counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
>> >>>>> outcome <- gl(3, 1, 9)
>> >>>>> treatment <- gl(3, 3)
>> >>>>> data.frame(treatment, outcome, counts)   # showing the data
>> >>>>> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>> >>>>> (ll <- logLik(glm.D93))
>> >>>>>
>> >>>>
>> >>>> --
>> >>>> Peter Dalgaard, Professor,
>> >>>> Center for Statistics, Copenhagen Business School
>> >>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> >>>> Phone: (+45)38153501
>> >>>> Office: A 4.23
>> >>>> Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
>>
>
