[R] Maximum likelihood estimation (stats4::mle)

Ronald Kölpin ronald.koelpin at gmail.com
Tue Jul 22 21:01:37 CEST 2014


Thanks, that was exactly it -- switching the values did the trick (and
was actually correct in terms of theory.) And of course, you are right
-- i changed the starting values to mean(x) - mean(y) for mu and
sqrt(var(x-y)) for sigma.

I also see your point about the theoretical justification for the
specific form of the likelihood function: F(x) - F(y) is supposed to
express the probability that a certain lies in the intervall (x,y),
that is x, y are takes as lower and upper bounds for the estimate
respectively. I think the theoretical justification is sound, but i'm
still trying to work out the details.

Thanks a lot anyways for solving my coding problem!

RK

On 22.07.2014 15:26, peter dalgaard wrote:
> 
> On 22 Jul 2014, at 06:04 , David Winsemius <dwinsemius at comcast.net>
> wrote:
> 
>> 
>> On Jul 21, 2014, at 12:10 PM, Ronald Kölpin wrote:
>> 
>>> Dear R-Community,
>>> 
>>> I'm trying to estimate the parameters of a probability
>>> distribution function by maximum likelihood estimation (using
>>> the stats4 function mle()) but can't seem to get it working.
>>> 
>>> For each unit of observation I have a pair of observations (a,
>>> r) which I assume (both) to be log-normal distributed (iid).
>>> Taking the log of both values I have (iid) normally distributed
>>> random variables and the likelihood function to be estimated
>>> is:
>>> 
>>> L = Product(F(x_i) - F(y_i), i=1..n)
>>> 
>>> where F is the Normal PDF and (x,y) := (log(a), log(r)). Taking
>>> the log and multiplying by -1 gives the negative loglikelihood
>> 
>> I don't see the need to multiply by -1. The log of any
>> probability is of necessity less than (or possibly equal to) 0
>> since probabilities are bounded above by 1.
> 
> 
> Well, mle() wants to minimize -log L by definition. That's just
> because optimizers tend to prefer to have an objective function to
> minimize rather than maximize.
> 
> However, what is keeping the terms of Product(F(x_i) - F(y_i),
> i=1..n) from going negative? As far as I can tell, the x_i are
> bigger than the corresponding y_i, and F is an increasing function
> so all terms are negative and the log of each term is undefined.
> Switching the order should help.
> 
> Also, what is the rationale for this form of likelihood? Surely not
> that the x,y pairs are independent lognormals. Looks more like what
> you'd get from interval censored data. If so, it should be possible
> to come up with better starting values than mu=0, sd=1.
> 
> -pd
> 
> 
>> So sums of these number will be negative which then allows you to
>> maximize their sums.
>> 
>>> 
>>> l = Sum(log( F(x_i) - F(y_i) ), i=1..n)
>>> 
>>> However estimation by mle() produces the error "vmmin is not
>>> finite"
>> 
>> 
>> As I would have predicted. If one maximizes numbers that get
>> larger as probabilities get small this is what would be
>> expected.
>> 
>> -- David.
>> 
>>> and "NaN have been created" - even though put bound on the
>>> parameters mu and sigma (see code below).
>>> 
>>> 
>>> library("stats4")
>>> 
>>> gaps <- matrix(nrow=10, ncol=4, dimnames=list(c(1:10),c("r_i",
>>> "a_i", "log(r_i)", "log(a_i)"))) gaps[,1] <- c(2.6, 1.4, 2.2,
>>> 2.9, 2.9, 1.7, 1.3, 1.7, 3.8, 4.5) gaps[,2] <- c(9.8, 20.5,
>>> 8.7, 7.2, 10.3, 11, 4.5, 5.2, 6.7, 7.6) gaps[,3] <-
>>> log(gaps[,1]) gaps[,4] <- log(gaps[,2])
>>> 
>>> nll <- function(mu, sigma) { if(sigma >= 0 && mu >= 0) { 
>>> -sum(log(pnorm(gaps[,3], mean=mu, sd=sigma) - pnorm(gaps[,4], 
>>> mean=mu, sd=sigma))) } else { NA } }
>>> 
>>> fit <- mle(nll, start=list(mu=0, sigma=1), nobs=10) print(fit)
>>> 
>>> 
>>> To be honest, I'm stumped and don't quite know what the problem
>>> is...
>>> 
>>> Regards and Thanks,
>>> 
>>> Ronald Kölpin
>>> 
>>> ______________________________________________ 
>>> R-help at r-project.org mailing list 
>>> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
>>> posting guide http://www.R-project.org/posting-guide.html and
>>> provide commented, minimal, self-contained, reproducible code.
>> 
>> David Winsemius Alameda, CA, USA
>> 
>> ______________________________________________ 
>> R-help at r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
>> posting guide http://www.R-project.org/posting-guide.html and
>> provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list