[R] univariate normal mixtures

Thu Jul 17 19:34:43 CEST 2003

	  Did you search "www.r-help.org" -> Search -> "R Site Search" and 
S-News (Google -> S-News -> S-News Mailing List Archives)?

	  I recently estimated a normal mixture, 2 components, mean 0, 
different variances.  I had computational difficulties until I took the 
time to avoid under and overflows, preserving numerical precision.  I 
got strange answers in my application, which convinced me that I had a 
3-component mixture, not 2, but it was not sufficiently important for me 
to modify my code to allow more than 2 components.

	  The code I used follows, in case it might be of interest to someone.

hope this helps.  spencer graves
#######################
dnorm2.0 <-
function(x, log.sd1=0.5*log(var(x)/10),
	log.sd2=0.5*log(10*var(x)), logit.w2=0,
	output.all=F, log.p=T){
# 	log(dnorm(...)) for both densities
	lg.dn1 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd1))^2)-log.sd1)
	lg.dn2 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd2))^2)-log.sd2)
#	Adjust one to use a negative exponential
#  in the denominator for
#  w2 = exp(logit.w2)/(1+exp(logit.w2))
#     = 1/(1+exp(-logit.w2))
	if(logit.w2>=0){
		lg.dn1 <- (lg.dn1-logit.w2)
		logit.w2 <- (-logit.w2)
	}
	else
		lg.dn2 <- (lg.dn2+logit.w2)
#	Now write log(exp(lg.dn1)+exp(lg.dn2))
#   = lg12 + log(1+exp(lg.12-lg12))
	lg12 <- pmax(lg.dn1, lg.dn2)
	lg.12 <- pmin(lg.dn1, lg.dn2)
#	Put it all together	
	lg.d2 <- (lg12 + log(1+exp(lg.12-lg12))
		 + logit.w2)
	if(log.p)lg.d2 else exp(lg.d2)
}

lglk.dn2 <-
function(x=c(log.sd1=log(1e-15), log.sd2=0, logit.w2=log(35/(127-35))),
	data.){
#		
	dn <- dnorm2.0(data., log.sd1=x[1],
		 log.sd2=x[2], logit.w2=x[3])
	(-sum(dn))
}


Carlos J. Gil Bellosta wrote:
> Well,
> 
> If k is known, you can use maximun likelihood to fit the weights, means, 
> and sd's. The EM algorithm can be of help to solve the optimization 
> problem. You would have to implement it yourself for your particular 
> case, but I do not think it is big trouble.
> 
> Then you could estimate k using Bayesian formalism: from a reasonable a 
> priory distribution on k=1, 2,... compute the posterior distributions 
> using the densities obtained above, etc.
> 
> Carlos J. Gil Bellosta
> Sigma Consultores Estadísticos
> http://www.consultoresestadisticos.com
> 
> Joke Allemeersch wrote:
> 
>> Hello,
>>
>> I have a concrete statistical question:
>> I have a sample of an univariate mixture of an unknown number (k) of 
>> normal distributions, each time with an unknown mean `m_i' and a 
>> standard deviation `k * m_i', where k is known factor constant for all 
>> the normal distributions. (The `i' is a subscript.)
>> Is there a function in R that can estimate the number of normal 
>> distributions k and the means `m_i' for the different normal 
>> distributions from a sample?  Or evt. a function that can estimate the 
>> `m_i', when the number of distributions `k' is known?
>> So far I only found a package, called `normix'.  But at first sight it 
>> only provides methods to sample from such distributions and to 
>> estimate the densities; but not to fit such a distribution.
>> Can someone indicate where I can find an elegant solution?
>>
>> Thank you in advance
>>
>> Joke Allemeersch
>>
>> Katholieke universiteit Leuven.
>> Belgium.
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help