[R] How to obtain individual log-likelihood value from glm?

John Fox jfox at mcmaster.ca
Sat Aug 29 18:39:20 CEST 2020


Dear John,

On 2020-08-29 11:18 a.m., John Smith wrote:
> Thanks for the very insightful thoughts. What I am trying to achieve with
> the weights is actually not new -- something like
> https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
> I thought my inquiry was not too strange, and that I could use some
> existing code. It is just an optimization problem at the end of the day,
> is it not? Thanks

So the object is to fit a regularized (i.e., penalized) logistic
regression rather than to fit by ML. glm() won't do that.

I took a quick look at the stackexchange link that you provided and the
document referenced in that link. The penalty proposed in the document
is just a multiple of the sum of squared regression coefficients, what
is usually called an L2 penalty in the machine-learning literature.
There are existing implementations of regularized logistic regression
in R -- see the machine learning CRAN task view
<https://cran.r-project.org/web/views/MachineLearning.html>. I believe
that the penalized package will fit a regularized logistic regression
with an L2 penalty.
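
For instance, a minimal sketch using the glmnet package rather than
penalized (its alpha = 0 setting gives the ridge/L2 penalty), on
made-up data:

   library(glmnet)
   set.seed(1)
   x <- matrix(rnorm(200), ncol = 2)    # toy predictors
   y <- rbinom(100, 1, plogis(x[, 1]))  # toy 0/1 response
   # alpha = 0 selects the ridge (L2) penalty; lambda is its multiplier
   fit <- glmnet(x, y, family = "binomial", alpha = 0, lambda = 0.1)
   coef(fit)                            # penalized coefficients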

As well, unless my quick reading was inaccurate, I think that you, and 
perhaps the stackexchange poster, might have been confused by the 
terminology used in the document: What's referred to as "weights" in the 
document is what statisticians more typically call "regression 
coefficients," and the "bias weight" is the "intercept" or "regression 
constant." Perhaps I'm missing some connection -- I'm not the best 
person to ask about machine learning.

Best,
  John

> 
> On Sat, Aug 29, 2020 at 9:02 AM John Fox <jfox at mcmaster.ca> wrote:
> 
>> Dear John,
>>
>> On 2020-08-29 1:30 a.m., John Smith wrote:
>>> Thanks Prof. Fox.
>>>
>>> I am curious: what is the model estimated below?
>>
>> Nonsense, as Peter explained in a subsequent response to your prior
>> posting.
>>
>>>
>>> I guess my inquiry seems more complicated than I thought: with y being
>>> 0/1, how to fit weighted logistic regression with weights < 1, in the
>>> sense of weighted least squares? Thanks
>>
>> What sense would that make? WLS is meant to account for non-constant
>> error variance in a linear model, but in a binomial GLM the variance is
>> purely a function of the mean.
>>
>> If you had binomial (rather than binary 0/1) observations (i.e.,
>> binomial trials exceeding 1), then you could account for overdispersion,
>> e.g., by introducing a dispersion parameter via the quasibinomial
>> family, but that isn't equivalent to variance weights in a LM; rather,
>> it's analogous to the error-variance parameter in a LM.
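>>
>> For instance, a minimal sketch with made-up binomial counts (the
>> estimated dispersion plays the role that the error variance does in
>> a LM):
>>
>>    set.seed(1)
>>    trials <- rep(10, 50)                        # binomial trials > 1
>>    x <- rnorm(50)
>>    y <- rbinom(50, trials, plogis(x)) / trials  # proportions of successes
>>    fit <- glm(y ~ x, family = quasibinomial(), weights = trials)
>>    summary(fit)$dispersion                      # estimated dispersion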
>>
>> I guess the question is what are you trying to achieve with the weights?
>>
>> Best,
>>    John
>>
>>>
>>>> On Aug 28, 2020, at 10:51 PM, John Fox <jfox at mcmaster.ca> wrote:
>>>>
>>>> Dear John
>>>>
>>>> I think that you misunderstand the use of the weights argument to
>>>> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior
>>>> weights are used to give the number of trials when the response is
>>>> the proportion of successes." That is, in this case y should be the
>>>> observed proportion of successes (i.e., between 0 and 1) and the
>>>> weights are integers giving the number of trials for each binomial
>>>> observation.
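>>>>
>>>> For example, a minimal sketch with made-up counts (the two calls
>>>> below fit the same model):
>>>>
>>>>    successes <- c(7, 3, 9)
>>>>    trials <- c(10, 10, 10)
>>>>    x <- c(-1, 0, 1)
>>>>    # proportion of successes as the response, trials as prior weights
>>>>    fit1 <- glm(successes/trials ~ x, family = binomial(),
>>>>                weights = trials)
>>>>    # the equivalent two-column (successes, failures) response
>>>>    fit2 <- glm(cbind(successes, trials - successes) ~ x,
>>>>                family = binomial())
>>>>    all.equal(coef(fit1), coef(fit2))  # TRUE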
>>>>
>>>> I hope this helps,
>>>> John
>>>>
>>>> John Fox, Professor Emeritus
>>>> McMaster University
>>>> Hamilton, Ontario, Canada
>>>> web: https://socialsciences.mcmaster.ca/jfox/
>>>>
>>>>> On 2020-08-28 9:28 p.m., John Smith wrote:
>>>>> If the weights < 1, then we have different values! See an example
>>>>> below. How should I interpret the logLik value then?
>>>>>
>>>>>    set.seed(135)
>>>>>    y <- c(rep(0, 50), rep(1, 50))
>>>>>    x <- rnorm(100)
>>>>>    data <- data.frame(x, y)
>>>>>    weights <- c(rep(1, 50), rep(2, 50))
>>>>>    fit <- glm(y ~ x, data, family = binomial(), weights = weights/10)
>>>>>    res.dev <- residuals(fit, type = "deviance")
>>>>>    res2 <- -0.5 * res.dev^2
>>>>>    cat("log-likelihood value", logLik(fit), sum(res2), "\n")
>>>>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard <pdalgd at gmail.com>
>>>>>> wrote:
>>>>>> If you don't worry too much about an additive constant, then half the
>>>>>> negative squared deviance residuals should do. (Not quite sure how
>>>>>> weights factor in. Looks like they are accounted for.)
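>>>>>>
>>>>>> For instance, with the Poisson example from the original posting,
>>>>>> the additive constant is the saturated log-likelihood, so the
>>>>>> individual contributions can be recovered exactly:
>>>>>>
>>>>>>    counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
>>>>>>    outcome <- gl(3, 1, 9); treatment <- gl(3, 3)
>>>>>>    fit <- glm(counts ~ outcome + treatment, family = poisson())
>>>>>>    d <- residuals(fit, type = "deviance")
>>>>>>    sat <- dpois(counts, counts, log = TRUE)       # saturated terms
>>>>>>    ll.i <- -0.5 * d^2 + sat                       # individual log-liks
>>>>>>    all.equal(sum(ll.i), as.numeric(logLik(fit)))  # TRUE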
>>>>>>
>>>>>> -pd
>>>>>>
>>>>>>> On 25 Aug 2020, at 17:33, John Smith <jswhct at gmail.com> wrote:
>>>>>>>
>>>>>>> Dear R-help,
>>>>>>>
>>>>>>> The function logLik can be used to obtain the maximum log-likelihood
>>>>>>> value from a glm object. This is an aggregated value, a summation of
>>>>>>> individual log-likelihood values. How do I obtain individual values?
>>>>>>> In the following example, I would expect 9 numbers since the response
>>>>>>> has length 9. I could write a function to compute the values, but
>>>>>>> there are lots of family members in glm, and I am trying not to
>>>>>>> reinvent wheels. Thanks!
>>>>>>>
>>>>>>>    counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
>>>>>>>    outcome <- gl(3, 1, 9)
>>>>>>>    treatment <- gl(3, 3)
>>>>>>>    data.frame(treatment, outcome, counts)  # showing the data
>>>>>>>    glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>>>>>>>    (ll <- logLik(glm.D93))
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Peter Dalgaard, Professor,
>>>>>> Center for Statistics, Copenhagen Business School
>>>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>>>> Phone: (+45)38153501
>>>>>> Office: A 4.23
>>>>>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>
> 


