[R] EM algorithm for Missing Data.

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Mon Jul 9 10:19:00 CEST 2007

On 09-Jul-07 02:20:47, Marcus Vinicius wrote:
>  Dear all,
> I need to use the EM algorithm where data are missing.
> Example:
> x<- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)
> May anyone help me?
> Thanks.
> Marcus Vinicius

The Dempster, Laird & Rubin reference given by Simon Blomberg
is the classical account of the EM Algorithm for incomplete
information, though there has been a lot more published since.

However, more to the point in the present case: If the above
is typical of your data, you had better state what you want to
do with the data.

Do you want to fit a distribution by estimating parameters?
Are they observations of a "response" variable with covariates
and you want to fit a linear model estimating the coefficients?
Are they data from a time-series and you need to interpolate
at the missing values?
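
To make the distinction concrete, here is a rough R sketch of how each
of the three aims treats the NAs in the example vector. None of it is
EM; it only shows that the aims lead to different computations. The
covariate z is purely hypothetical (the original post gives none), and
linear interpolation via approx() is just one simple stand-in for a
time-series method.

```r
x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

# (1) Fitting a distribution: estimate parameters from the observed values
mu_hat <- mean(x, na.rm = TRUE)
sd_hat <- sd(x, na.rm = TRUE)

# (2) Linear model: z is a HYPOTHETICAL covariate, invented for illustration.
#     With the default na.action (na.omit), lm() drops rows whose response is NA.
z   <- seq_along(x)
fit <- lm(x ~ z)
n_used <- nobs(fit)   # number of rows actually used in the fit

# (3) Time series: fill the gaps by linear interpolation between neighbours.
#     approx() ignores the NA points when building the interpolant.
x_interp <- approx(seq_along(x), x, xout = seq_along(x))$y
```

In case (2) the fit simply uses the complete cases, whereas in case (3)
the missing positions themselves are what you are trying to recover.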

Depending on what you want to do, the way you apply the general
EM Algorithm procedure may be very different; and a lot of
applications are not covered by Dempster, Laird & Rubin (1977).

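And there may possibly be no point anyway: If all you want to do
is estimate the mean of the distribution of the data, and the
values are missing at random with no covariates to inform them,
then the best procedure may simply be to ignore the missing data.
In that situation the missing values carry no information about
the mean, and EM for a normal model would just converge to the
mean of the observed values.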
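
As a rough check on the point about estimating the mean, the following
is a minimal sketch of EM for a univariate normal model applied to the
example vector, assuming the values are missing completely at random.
The variable names and the starting values are my own choices, not
anything from Dempster, Laird & Rubin. The E-step replaces the missing
contributions to sum(x) and sum(x^2) by their expectations under the
current parameters; the M-step recomputes the mean and variance.

```r
x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

obs   <- x[!is.na(x)]    # observed values
n     <- length(x)       # full sample size, including the missing cases
n_mis <- sum(is.na(x))   # number of missing values

# Deliberately poor starting values (assumed, for illustration only)
mu     <- 0
sigma2 <- 1

for (iter in 1:200) {
  # E-step: expected sufficient statistics given the current mu and sigma2
  s1 <- sum(obs)   + n_mis * mu
  s2 <- sum(obs^2) + n_mis * (mu^2 + sigma2)

  # M-step: maximum-likelihood estimates from the completed statistics
  mu_new     <- s1 / n
  sigma2_new <- s2 / n - mu_new^2

  converged <- abs(mu_new - mu) < 1e-10 && abs(sigma2_new - sigma2) < 1e-10
  mu     <- mu_new
  sigma2 <- sigma2_new
  if (converged) break
}

# Compare with simply ignoring the missing data
same_mean <- isTRUE(all.equal(mu, mean(obs), tolerance = 1e-8))
same_var  <- isTRUE(all.equal(sigma2, mean((obs - mean(obs))^2),
                              tolerance = 1e-6))
```

Under these assumptions both comparisons should come out TRUE: the
iteration ends at the mean and (maximum-likelihood) variance of the
observed values, so nothing is gained over mean(x, na.rm = TRUE).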

Best wishes,

E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Jul-07                                       Time: 09:18:56
