[R] missing values imputation

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 12 19:15:21 CEST 2004


That's not an algorithm.  It is a recipe for deriving an algorithm.

  algorithm - A detailed sequence of actions to perform to accomplish some
  task. Named after an Iranian mathematician, Al-Khawarizmi.

  Technically, an algorithm must reach a result after a finite number of
  steps, thus ruling out brute force search methods for certain problems,
  though some might claim that brute force search was also a valid (generic)  
  algorithm. The term is also used loosely for any sequence of actions
  (which may or may not terminate).

Paul E. Black's Dictionary of Algorithms, Data Structures, and Problems.

On Wed, 12 May 2004 Ted.Harding at nessie.mcc.ac.uk wrote:

> On 12-May-04 Rolf Turner wrote:
> > Anne Piotet wrote:
> > 
> >> What R functionalities are there to do missing values imputation
> >> (substantial proportion of missing data)?  I would prefer to use
> >> maximum likelihood methods; is the EM algorithm implemented? In
> >> which package?
> > 
> >       The so-called ``EM algorithm'' is ***NOT*** an
> >       algorithm.  It is a methodology or a unifying concept.
> >       It would be impossible to ``implement'' it.  (Except
> >       possibly by means of some extremely advanced and
> >       sophisticated Artificial Intelligence software.)
> 
> Do we understand the same thing by "EM Algorithm"?
> 
> The one I'm thinking of -- formulated under that name by Dempster,
> Laird and Rubin in 1977 ("Maximum likelihood estimation from incomplete
> data via the EM algorithm", JRSS(B) 39, 1-38) -- is indeed an algorithm
> in exactly the same sense as any iterative search for the maximum of a
> function.
> 
> Essentially, in the context of data modelled by an underlying exponential
> family distribution where there is incomplete information about the
> values which have this distribution, it proceeds by
> 
> Start: Choose starting estimates for the parameters of the distribution
> E: Using the current parameter values, compute the expected values
>    of the sufficient statistics conditional on the observed information
> M: Solve the maximum-likelihood equations (which are functions of the
>    sufficient statistics) using the expected values computed in (E)
> If sufficiently converged, stop. Otherwise, make the current parameter
> values equal to the values estimated in (M) and return to (E).
> 
> Algorithm, this, or not????
> 
> And where does "extremely advanced and sophisticated Artificial
> Intelligence software" come into it? You can, in some cases, perform
> the above EM algorithm by hand.
> 
> Which "EM Algorithm" are you thinking of?
> 
> Best wishes,
> Ted.
> 
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 167 1972
> Date: 12-May-04                                       Time: 17:57:53
> ------------------------------ XFMail ------------------------------
> 
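
To make the quoted E and M steps concrete, here is a minimal sketch for one
specific model: the multinomial genetic-linkage example usually quoted from
the Dempster, Laird and Rubin paper cited above, with cell probabilities
(1/2 + t/4, (1 - t)/4, (1 - t)/4, t/4) and observed counts (125, 18, 20, 34).
It is written in Python purely for illustration. The function name
em_linkage and the parameters tol and max_iter are hypothetical choices made
here, not part of any package. Note that both steps had to be worked out by
hand for this particular model before anything could be coded.

```python
def em_linkage(counts, theta=0.5, tol=1e-10, max_iter=1000):
    """EM for the four-cell linkage model (illustrative sketch only).

    The first observed cell is treated as the sum of two unobserved
    cells with probabilities 1/2 and theta/4; the split is the
    "missing" information.
    """
    y1, y2, y3, y4 = counts
    for iteration in range(1, max_iter + 1):
        # E step: expected count in the hidden theta/4 part of cell 1,
        # given the observed total y1 and the current theta.
        x2 = y1 * (theta / 4) / (0.5 + theta / 4)
        # M step: complete-data maximum-likelihood estimate of theta,
        # using the expected count in place of the unobserved one.
        new_theta = (x2 + y4) / (x2 + y2 + y3 + y4)
        # Stop once successive estimates agree to within tol.
        if abs(new_theta - theta) < tol:
            return new_theta, iteration
        theta = new_theta
    raise RuntimeError("EM did not converge within max_iter iterations")


theta_hat, n_iter = em_linkage((125, 18, 20, 34))
print(f"theta_hat = {theta_hat:.6f} after {n_iter} iterations")
```

For this model the fixed point can be checked against the positive root of
197 t^2 - 15 t - 68 = 0, which is what setting the derivative of the
observed-data log-likelihood to zero gives. A different model (for example a
multivariate normal with missing entries) needs its own E and M steps.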

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



