[R] Servreg $loglik

Tue Jul 20 18:42:40 CEST 2010

David:

Thank you for your comments.  

I do understand that absolute values of likelihoods by themselves aren't
meaningful, and only gain meaning when compared with others computed using
the same model but with differing parameter values (for example). That is
why I compute likelihoods myself for confidence interval construction using
the loglikelihood ratio criterion.  But when my plain-vanilla
max(loglikelihood) didn't agree with that reported by survreg() (except when
the data are unweighted) I was afraid I overlooked something.  I am still
puzzled that the servreg(), using the same model with the same data (37
weighted and the corresponding 70 unweighted) produces different values for
loglik.

Thank you for your help.  It gives me peace of mind.  ~:-)

Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
561-352-9699
http://www.StatisticalEngineering.com

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Tuesday, July 20, 2010 12:27 PM
To: Charles.Annis at statisticalengineering.com
Cc: r-help at r-project.org
Subject: Re: [R] Servreg $loglik

On Jul 20, 2010, at 11:20 AM, Charles Annis, P.E. wrote:

> Dear R-experts:
>
> I am using survreg() to estimate the parameters of a Weibull density  
> having
> right-censored observations.  Some observations are weighted.  To do  
> that I
> regress the weighed observations against a column of ones.
>
> When I enter the data as 37 weighted observations, the parameter  
> estimates
> are exactly the same as when I enter the data as the corresponding 70
> unweighted observations.  This is to be expected, of course.
>
> I don't understand, however, why the reported loglikelihood is
>> parameter.estimates$loglik
> [1] -120.4699 -120.4699
> for the 37 weighted observations, but
>> parameter.estimates$loglik
> [1] -135.1527 -135.1527
> for the 70 unweighted observations.

This has come up on r-help many times before (and probably on other  
lists as well), despite not being an R question at all. It is  
commonplace in modeling grouped data to see likelihoods reported  
differently from the result obtained when modeling ungrouped data  
representations with the same frequencies. The only valid statistical  
process is to compare differences in the likelihoods (or log(L) ),  
since the likelihood (or log(L) ) is only defined up to an arbitrary  
constant. You need to be comparing the result to some sort of "null  
model" for it to have any meaning. (... or perhaps that is your null  
model and you need to be looking at the impact of adding a covariate  
or two.)

-- 
David.

>
> (For the record, my computations of the loglikelihood, using the  
> dweibull()
> function for the observations and the pweibull() function for the  
> censored
> observations, is -135.1527 for both 37 weighted and 70 unweighted.)
>
> I am using the data from Meeker and Escobar, _Statistical Methods for
> Reliability Data_, Wiley (1998), Table C.1, shown below:
>
> Hours	Status	Num.Parts
>  450	Failure	1
>  460	R-Censored	1
> 1150	Failure	2
> 1560	R-Censored	1
> 1600	Failure	1
> 1660	R-Censored	1
> 1850	R-Censored	5
> 2030	R-Censored	3
> 2070	Failure	2
> 2080	Failure	1
> 2200	R-Censored	1
> 3000	R-Censored	4
> 3100	Failure	1
> 3200	R-Censored	1
> 3450	Failure	1
> 3750	R-Censored	2
> 4150	R-Censored	4
> 4300	R-Censored	4
> 4600	Failure	1
> 4850	R-Censored	4
> 5000	R-Censored	3
> 6100	R-Censored	3
> 6100	Failure	1
> 6300	R-Censored	1
> 6450	R-Censored	2
> 6700	R-Censored	1
> 7450	R-Censored	1
> 7800	R-Censored	2
> 8100	R-Censored	2
> 8200	R-Censored	1
> 8500	R-Censored	3
> 8750	R-Censored	2
> 8750	Failure	1
> 9400	R-Censored	1
> 9900	R-Censored	1
> 10100	R-Censored	3
> 11500	R-Censored	1
>
> I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box  
> with 8 gig
> RAM.
>
>
> Thank you for your help.
>
> Charles Annis, P.E.
>
> Charles.Annis at StatisticalEngineering.com
> 561-352-9699
> http://www.StatisticalEngineering.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT