[R] missing values imputation

A.J. Rossini rossini at blindglobe.net
Wed May 12 19:34:30 CEST 2004


Picky, picky.

Details are in the eyes of the beholder.

Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:

> That's not an algorithm.  It is a recipe for deriving an algorithm.
>
>   algorithm - A detailed sequence of actions to perform to accomplish some
>   task. Named after an Iranian mathematician, Al-Khawarizmi.
>
>   Technically, an algorithm must reach a result after a finite number of
>   steps, thus ruling out brute force search methods for certain problems,
>   though some might claim that brute force search was also a valid (generic)  
>   algorithm. The term is also used loosely for any sequence of actions
>   (which may or may not terminate).
>
> Paul E. Black's Dictionary of Algorithms, Data Structures, and Problems.
>
> On Wed, 12 May 2004 Ted.Harding at nessie.mcc.ac.uk wrote:
>
>> On 12-May-04 Rolf Turner wrote:
>> > Anne Piotet wrote:
>> > 
>> >> What R functionality is there for missing-value imputation
>> >> (substantial proportion of missing data)?  I would prefer to use
>> >> maximum likelihood methods; is the EM algorithm implemented, and in
>> >> which package?
>> > 
>> >       The so-called ``EM algorithm'' is ***NOT*** an
>> >       algorithm.  It is a methodology or a unifying concept.
>> >       It would be impossible to ``implement'' it.  (Except
>> >       possibly by means of some extremely advanced and
>> >       sophisticated Artificial Intelligence software.)
>> 
>> Do we understand the same thing by "EM Algorithm"?
>> 
>> The one I'm thinking of -- formulated under that name by Dempster,
>> Laird and Rubin in 1977 ("Maximum likelihood estimation from incomplete
>> data via the EM  algorithm", JRSS(B) 39, 1-38) -- is indeed an algorithm
>> in exactly the same sense as any iterative search for the maximum of a
>> function.
>> 
>> Essentially, in the context of data modelled by an underlying exponential
>> family distribution where there is incomplete information about the
>> values which have this distribution, it proceeds by
>> 
>> Start: Choose starting estimates for the parameters of the distribution
>> E: Using the current parameter values, compute the expected values
>>    of the sufficient statistics conditional on the observed information
>> M: Solve the maximum-likelihood equations (which are functions of the
>>    sufficient statistics) using the expected values computed in (E)
>> If sufficiently converged, stop. Otherwise, make the current parameter
>> values equal to the values estimated in (M) and return to (E).
>> 
>> Algorithm, this, or not????
>> 
>> And where does "extremely advanced and sophisticated Artificial
>> Intelligence software" come into it? You can, in some cases, perform
>> the above EM algorithm by hand.
>> 
>> Which "EM Algorithm" are you thinking of?
>> 
>> Best wishes,
>> Ted.
>> 
>> 
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
>> Fax-to-email: +44 (0)870 167 1972
>> Date: 12-May-04                                       Time: 17:57:53
>> ------------------------------ XFMail ------------------------------
>> 
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>> 
>> 
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
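
For concreteness, the Start / E / M / convergence loop quoted above can be
sketched for one simple missing-data case: a bivariate normal sample in
which x is always observed and some y values are missing. This is only an
illustration under those assumptions. The function name
em_bivariate_normal, the toy data, and the tolerance are made up for the
example and do not come from the thread or from any R package.

```python
def em_bivariate_normal(x, y, tol=1e-10, max_iter=10000):
    """EM for a bivariate normal where some y values are missing (None).

    Assumes x is fully observed and not constant.
    Returns (mu_x, mu_y, s_xx, s_xy, s_yy) with ML (divide-by-n) variances.
    """
    n = len(x)
    obs = [v for v in y if v is not None]

    # Start: crude estimates from whatever data are available.
    # x is complete, so mu_x and s_xx are already their ML values
    # and are not changed by the iterations below.
    mu_x = sum(x) / n
    s_xx = sum((v - mu_x) ** 2 for v in x) / n
    mu_y = sum(obs) / len(obs)
    s_yy = sum((v - mu_y) ** 2 for v in obs) / len(obs)
    s_xy = 0.0

    for _ in range(max_iter):
        # E step: expected sufficient statistics (sum y, sum y^2, sum x*y)
        # conditional on the observed data and the current parameters.
        beta = s_xy / s_xx
        resid_var = s_yy - s_xy ** 2 / s_xx
        sum_y = sum_yy = sum_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:
                ey = mu_y + beta * (xi - mu_x)  # E[y | x]
                eyy = ey ** 2 + resid_var       # E[y^2 | x]
            else:
                ey, eyy = yi, yi ** 2
            sum_y += ey
            sum_yy += eyy
            sum_xy += xi * ey

        # M step: ML estimates written as functions of those statistics.
        new_mu_y = sum_y / n
        new_s_yy = sum_yy / n - new_mu_y ** 2
        new_s_xy = sum_xy / n - mu_x * new_mu_y

        # Convergence check on the largest parameter change.
        change = max(abs(new_mu_y - mu_y),
                     abs(new_s_yy - s_yy),
                     abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break

    return mu_x, mu_y, s_xx, s_xy, s_yy


# Toy data (illustrative only): two of the six y values are missing.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, None, 4.8, None]
estimates = em_bivariate_normal(x, y)
print(estimates)
```

For this particular missing-data pattern the ML estimates are also
available in closed form (regress y on x in the complete cases, then plug
in the mean and variance of all the x values), so EM is not strictly
needed here. That makes it a convenient check that the iteration ends up
where it should.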

-- 
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email
