[R] univariate normal mixtures
spencer.graves at pdf.com
Thu Jul 17 19:34:43 CEST 2003
Did you search "www.r-help.org" -> Search -> "R Site Search" and
S-News (Google -> S-News -> S-News Mailing List Archives)?
I recently estimated a normal mixture, 2 components, mean 0,
different variances. I had computational difficulties until I took the
time to avoid under and overflows, preserving numerical precision. I
got strange answers in my application, which convinced me that I had a
3-component mixture, not 2, but it was not sufficiently important for me
to modify my code to allow more than 2 components.
The code I used follows, in case it might be of interest to someone.
hope this helps. spencer graves
# log(dnorm(...)) for both densities
lg.dn1 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd1))^2)-log.sd1)
lg.dn2 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd2))^2)-log.sd2)
# Adjust one to use a negative exponential
# in the denominator for
# w2 = exp(logit.w2)/(1+exp(logit.w2))
# = 1/(1+exp(-logit.w2))
lg.dn1 <- (lg.dn1-logit.w2)
logit.w2 <- (-logit.w2)
lg.dn2 <- (lg.dn2+logit.w2)
# Now write log(exp(lg.dn1)+exp(lg.dn2))
# = lg12 + log(1+exp(lg.12-lg12))
lg12 <- pmax(lg.dn1, lg.dn2)
lg.12 <- pmin(lg.dn1, lg.dn2)
# Put it all together
lg.d2 <- (lg12 + log(1+exp(lg.12-lg12))
if(log.p)lg.d2 else exp(lg.d2)
function(x=c(log.sd1=log(1e-15), log.sd2=0, logit.w2=log(35/(127-35))),
dn <- dnorm2.0(data., log.sd1=x,
Carlos J. Gil Bellosta wrote:
> If k is known, you can use maximun likelihood to fit the weights, means,
> and sd's. The EM algorithm can be of help to solve the optimization
> problem. You would have to implement it yourself for your particular
> case, but I do not think it is big trouble.
> Then you could estimate k using Bayesian formalism: from a reasonable a
> priory distribution on k=1, 2,... compute the posterior distributions
> using the densities obtained above, etc.
> Carlos J. Gil Bellosta
> Sigma Consultores Estadísticos
> Joke Allemeersch wrote:
>> I have a concrete statistical question:
>> I have a sample of an univariate mixture of an unknown number (k) of
>> normal distributions, each time with an unknown mean `m_i' and a
>> standard deviation `k * m_i', where k is known factor constant for all
>> the normal distributions. (The `i' is a subscript.)
>> Is there a function in R that can estimate the number of normal
>> distributions k and the means `m_i' for the different normal
>> distributions from a sample? Or evt. a function that can estimate the
>> `m_i', when the number of distributions `k' is known?
>> So far I only found a package, called `normix'. But at first sight it
>> only provides methods to sample from such distributions and to
>> estimate the densities; but not to fit such a distribution.
>> Can someone indicate where I can find an elegant solution?
>> Thank you in advance
>> Joke Allemeersch
>> Katholieke universiteit Leuven.
>> R-help at stat.math.ethz.ch mailing list
> R-help at stat.math.ethz.ch mailing list
More information about the R-help