[R] Questions about EM algorithm

Wed Feb 20 05:34:24 CET 2008

G'day Sean,

On Fri, 15 Feb 2008 09:12:22 +0800
"Hung-Hsuan Chen (Sean)" <sandwichc at gmail.com> wrote:

> Assume I have 3 distributions, x1, x2, and x3.
> x1 ~ normal(mu1, sd1)
> x2 ~ normal(mu2, sd2)
> x3 ~ normal(mu3, sd3)
> y1 = x1 + x2
> y2 = x1 + x3
> 
> The only data I can observe are y1 and y2. It is
> easy to estimate (mu1+mu2), (mu1+mu3), (sd1^2+sd2^2) and
> (sd1^2+sd3^2) by the EM algorithm, since

Isn't it a bit of overkill to use an EM algorithm here?  There are
explicit formulae for the estimators of those quantities (namely the
sample average and the sample variance).  O.k., the sample variance
with divisor n-1 is not the MLE, but that is very easy to correct:
multiply it by (n-1)/n.
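
To make that correction concrete, here is a minimal sketch in Python
(standard library only).  The data are simulated and the parameter
values are made up purely for illustration; nothing here comes from
Sean's actual data.

```python
import random
import statistics

random.seed(1)

# Hypothetical true values, chosen only for this illustration.
mu1, mu2, sd1, sd2 = 1.0, 2.0, 1.5, 0.5
n = 1000

# Simulate y1 = x1 + x2 with independent normal x1 and x2.
y1 = [random.gauss(mu1, sd1) + random.gauss(mu2, sd2) for _ in range(n)]

mean_y1 = statistics.fmean(y1)            # estimates mu1 + mu2
var_unbiased = statistics.variance(y1)    # divisor n - 1, estimates sd1^2 + sd2^2
var_mle = var_unbiased * (n - 1) / n      # divisor n, the normal-theory MLE

print(mean_y1, var_unbiased, var_mle)
```

The same two lines applied to y2 would give the estimates of mu1+mu3
and sd1^2+sd3^2.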

> y1 ~ normal(mu1+mu2, sqrt(sd1^2+sd2^2)) and
> y2 ~ normal(mu1+mu3, sqrt(sd1^2+sd3^2))
> 
> However, I want to estimate mu1, mu2, mu3, sd1, sd2, and sd3.
> Is it possible to do so by EM algorithm (or any other estimation
> methods like gibbs sampling or MLE) ?

EM algorithms are a way of calculating MLEs by framing the problem
(explicitly or implicitly) in a missing data context.  So "EM
algorithm" and "MLE" are not different methods.  The former is a way of
calculating the latter; of course, the latter can also be calculated by
directly maximising the (log)likelihood function.

You did not say so explicitly, but I guess you are assuming that x1, x2
and x3 are independent, aren't you?  At least under this assumption it is
easy to deduce that the distributions of y1 and y2 are as you stated.
If you do not assume independence of x1, x2, x3, what other assumptions
do you make to arrive at these distributions for y1 and y2?

Under the assumption of independence of x1, x2, x3, one would also have
that Cov(y1,y2)=Var(x1)=sd1^2 (provided y1 and y2 are observed in pairs
that share the same x1).  Together with the fact that
Var(y1)=sd1^2+sd2^2 and Var(y2)=sd1^2+sd3^2, this makes the three
standard deviations identifiable, and you can readily estimate them:
sd2^2=Var(y1)-Cov(y1,y2) and sd3^2=Var(y2)-Cov(y1,y2).
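
A rough sketch of that moment-based recipe in Python (standard library
only) might look as follows.  It assumes that y1 and y2 are observed in
pairs built from the same draw of x1; the true standard deviations, the
sample size and the helper name cov are invented for the illustration.

```python
import random

random.seed(2)

# Hypothetical true standard deviations, for illustration only.
sd1, sd2, sd3 = 2.0, 1.0, 1.5
n = 20000

y1, y2 = [], []
for _ in range(n):
    x1 = random.gauss(0.0, sd1)              # component shared by y1 and y2
    y1.append(x1 + random.gauss(0.0, sd2))
    y2.append(x1 + random.gauss(0.0, sd3))

def cov(a, b):
    """Sample covariance with divisor n - 1 (hypothetical helper)."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

c12 = cov(y1, y2)    # estimates sd1^2
v1 = cov(y1, y1)     # estimates sd1^2 + sd2^2
v2 = cov(y2, y2)     # estimates sd1^2 + sd3^2

# In small samples c12 or the differences below can come out negative;
# this sketch does not guard against that case.
sd1_hat = c12 ** 0.5
sd2_hat = (v1 - c12) ** 0.5
sd3_hat = (v2 - c12) ** 0.5

print(sd1_hat, sd2_hat, sd3_hat)
```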

Actually, if x1, x2 and x3 are independent, then they would be jointly
normal, hence y1 and y2 would be jointly normal, whence it would be easy
to write down the likelihood of the parameters given y1 and y2 and find
the MLEs for sd1, sd2, sd3.
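
As a sketch of what writing down that likelihood could look like, the
Python function below (standard library only) evaluates the bivariate
normal log-likelihood for paired observations.  The names loglik, m1
and m2 are my own, with m1 = mu1+mu2 and m2 = mu1+mu3; the simulated
data and true values are made up.  As far as I can see, when the
sample covariance is positive and smaller than both sample variances,
the divisor-n moment estimates are the MLEs by invariance, so they are
used here as the candidate maximiser.

```python
import math
import random

def loglik(m1, m2, sd1, sd2, sd3, y1, y2):
    """Log-likelihood of paired (y1, y2) under the bivariate normal model."""
    a = sd1**2 + sd2**2      # Var(y1)
    d = sd1**2 + sd3**2      # Var(y2)
    b = sd1**2               # Cov(y1, y2)
    det = a * d - b * b      # determinant of the 2x2 covariance matrix
    total = 0.0
    for u, v in zip(y1, y2):
        du, dv = u - m1, v - m2
        quad = (d * du * du - 2.0 * b * du * dv + a * dv * dv) / det
        total += -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total

random.seed(3)
n = 2000
y1, y2 = [], []
for _ in range(n):
    x1 = random.gauss(0.0, 2.0)              # hypothetical mu1 = 0, sd1 = 2
    y1.append(x1 + random.gauss(1.0, 1.0))   # hypothetical mu2 = 1, sd2 = 1
    y2.append(x1 + random.gauss(-1.0, 1.5))  # hypothetical mu3 = -1, sd3 = 1.5

# Candidate maximiser: sample means and divisor-n (co)variances.
m1_hat = sum(y1) / n
m2_hat = sum(y2) / n
s11 = sum((u - m1_hat) ** 2 for u in y1) / n
s22 = sum((v - m2_hat) ** 2 for v in y2) / n
s12 = sum((u - m1_hat) * (v - m2_hat) for u, v in zip(y1, y2)) / n
sd_hat = (math.sqrt(s12), math.sqrt(s11 - s12), math.sqrt(s22 - s12))

best = loglik(m1_hat, m2_hat, *sd_hat, y1, y2)
print(sd_hat, best)
```

Moving any of the five arguments away from these values should lower
the log-likelihood; a general-purpose optimiser applied to loglik
would be the alternative if the closed form were not available.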

For mu1, mu2 and mu3 you have an identifiability problem: the two triples
(mu1, mu2, mu3) and (mu1+c, mu2-c, mu3-c) (where c is any fixed
number) would yield exactly the same likelihood value.  Hence, these
three parameters are not identifiable.  You would have to fix one of
them arbitrarily, say mu1=0.
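
A tiny Python check of this point: the means enter the distribution of
(y1, y2) only through mu1+mu2 and mu1+mu3, and both sums are unchanged
by the shift.  The function name y_means and the numbers are
illustrative only.

```python
def y_means(mu1, mu2, mu3):
    """Means of y1 = x1 + x2 and y2 = x1 + x3."""
    return (mu1 + mu2, mu1 + mu3)

base = y_means(1.0, 2.0, 3.0)

# Shifting by any c leaves both means, and hence the likelihood, unchanged.
for c in (-2.0, 0.5, 10.0):
    shifted = y_means(1.0 + c, 2.0 - c, 3.0 - c)
    print(c, base, shifted)
```

With mu1 fixed at 0, the remaining two means are simply the means of
y1 and y2.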

Best wishes,

	Berwin 

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability        +65 6515 6650 (self)
Faculty of Science                          FAX : +65 6872 3919       
National University of Singapore     
6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba