[R] MLE, precision

Spencer Graves spencer.graves at pdf.com
Tue Jul 13 17:30:39 CEST 2004

      Have you considered estimating ln.m1, ln.m2, and ln.b, which makes 
the negative log likelihood something like the following: 

l.ln<- function(ln.m1,ln.m2,ln.b){
    m1 <- exp(ln.m1); m2 <- exp(ln.m2); b <- exp(ln.b)
    lglk <- d*( ln.m1 + ln.m2        
         + log1p(-exp(-(b+m2)*t)
         + (m1/b-d)*log(m2+b*exp(-b+m2)*t))
          + m1*t - m1/b*log(b+m2) )


	  I don't know if I have this correct, but you should get the idea.  Parameterizing in terms of logarithms automatically eliminates the constraints that m1, m2, and b must be positive.  

	  I also prefer to play with the function until I'm reasonably confident it will never produce NAs, and I use a few tricks to preserve numerical precision where I can.  For example, log(b+m2) = log(b) + log1p(m2/b) = log(m2) + log1p(b/m2).  If you use the first form when b is larger and the second when m1 is larger, you should get more accurate answers.  Using, e.g.:  

	  log.apb <- function(log.a, log.b){
	  	  min.ab <- pmin(log.a, log.b)
		  max.ab <- pmax(log.a, log.b)
	  	  max.ab + log1p(exp(min.ab-max.ab))

If log.a and log.b are both really large, a and b could be Inf when log.a and log.b are finite.  Computing log(a+b) like this eliminates that problem.  The same problem occurs when log.a and log.b are so far negative that a and b are both numerically 0, even though log.a and log.b are very finite.  This function eliminates that problem.  

	  Also, have you tried plotting your "l" vs. m1 with m2 and b constant, and then vs. m2 with m2 and b constant and vs. b with m1 and m2 constant?  Or (better) make contour plots of "l" vs. any 2 of these parameters with the other held constant.  When I've done this in crudely similar situations, I've typically found that the log(likelihood) was more nearly parabolic in terms of ln.m1, ln.m2, and ln.b than in terms of the untransformed variables.  This means that the traditional Wald confidence region procedures are more accurate, as they assume that the log(likelihood) is parabolic in the parameters estimated.  

	  hope this  helps.  spencer graves
p.s.  I suggest you avoid using "t" as a variable name:  That's the name of the function for the transpose of a matrix.  R and usually though not always tell from the context what you want.  However, it's best to avoid that ambiguity.  I often test at a command prompt variable names I want to use.  If the response is "object not found", then I feel like I can use it.  

boshao zhang wrote:

>Hi, everyone
>I am trying to estimate 3 parameters for my survival
>function. It's very complicated. The negative
>loglikelihood function is:
>l<- function(m1,m2,b)  -sum(    d*( log(m1) + log(m2)
>+ log(1- exp(-(b + m2)*t)) ) + (m1/b - d)*log(m2 +
>b*exp(-(b + m2)*t) ) + m1*t - m1/b*log(b+m2)      )
>here d and t are given, "sum"  means sum over these
>two vairables. 
>the parameters are assumed small, m1, m2 in
>thousandth, m2 in millionth.
>I used the function "nlm" to estimate m1,m2,b. But the
>result is very bad. you can get more than 50 warnings,
>most of them are about "negative infinity"in log. And
>the results are initial value dependent, or you will
>get nothing when you choose some values.
>So I tried brutal force, i.e. evaluate the values of
>grid point. It works well. Also, you can get the
>correct answer of log(1e-12).
>My questions are:
> What is the precision of a variable in R?
> How to specify the constraint interval of parameters
>in nlm? I tried lower, upper, it doesn't work.
>any advice on MLE is appreciated.
>Thank you.
>R-help at stat.math.ethz.ch mailing list
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

More information about the R-help mailing list