[R] missing data imputation

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sun Jul 10 15:10:40 CEST 2005

(Ted Harding) wrote:

> In many cases people simply treat negative estimates of variables
> which are intrinsically non-negative very crudely: if it comes
> out negative, replace it with zero. This too is often a quick
> fix where the fact that it is a lie simply has no practical
> importance. But, of course, it may matter! That depends ...
> (see above).

That will result in a strange distribution of imputed values, with a
spike of values piled up at exactly zero.
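
A small simulation shows the effect. This is only a sketch in Python, 
under assumptions of my own choosing: the imputation model is taken to be 
Normal with mean 1 and SD 2, and the function name impute_truncated is 
made up for illustration.

```python
import random
import statistics


def impute_truncated(n, mean, sd, seed=0):
    """Draw n values from Normal(mean, sd), then replace negatives with 0.

    Hypothetical helper for illustration; the Normal model and its
    parameters are assumptions, not taken from any real data set.
    """
    rng = random.Random(seed)
    raw = [rng.gauss(mean, sd) for _ in range(n)]
    truncated = [max(0.0, x) for x in raw]
    return raw, truncated


raw, truncated = impute_truncated(n=10_000, mean=1.0, sd=2.0)

# Fraction of imputed values sitting exactly at zero after the crude fix.
share_at_zero = sum(1 for x in truncated if x == 0.0) / len(truncated)

print("share of imputations at exactly 0:", round(share_at_zero, 3))
print("mean before / after:", statistics.fmean(raw), statistics.fmean(truncated))
print("SD before / after:  ", statistics.pstdev(raw), statistics.pstdev(truncated))
```

Under these assumptions roughly 30% of the imputed values land on zero, 
the mean is pushed upward and the spread shrinks, so the imputed values 
no longer resemble draws from any sensible distribution for the variable.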

. . .

> I've also noted Frank Harrell's comment about aregImpute, and
> will bear it in mind. Note, however, that this does not do
> multiple imputation on the same lines as NORM (or the other
> Schafer-derived MI packages). See ?aregImpute section "Details".
> And, specifically, from the "Description":

It is different, but aregImpute approximates the full Bayesian 
procedure.  MICE is another approach to approximating it, and aregImpute 
seems to agree well with MICE when you force linearity in aregImpute 
(because like NORM, MICE cannot handle nonlinearity).
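
The difference between fixing the imputation model once and refitting it 
for every imputation can be sketched as follows. This is a toy Python 
illustration, not the Hmisc code and far simpler than what aregImpute 
does: a straight-line fit stands in for the flexible additive model, a 
randomly drawn residual stands in for the actual drawing step, the data 
are simulated, and the names fit_line, impute_fixed_model and 
impute_bootstrap_refit are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, residuals


def impute_fixed_model(xs, ys, x_new, m, rng):
    """Fit once, then draw m imputations from that single fitted model."""
    intercept, slope, residuals = fit_line(xs, ys)
    prediction = intercept + slope * x_new
    predictions = [prediction] * m
    imputations = [prediction + rng.choice(residuals) for _ in range(m)]
    return predictions, imputations


def impute_bootstrap_refit(xs, ys, x_new, m, rng):
    """Refit on a fresh bootstrap resample before each imputation."""
    n = len(xs)
    predictions, imputations = [], []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        intercept, slope, residuals = fit_line(bx, by)
        prediction = intercept + slope * x_new
        predictions.append(prediction)
        imputations.append(prediction + rng.choice(residuals))
    return predictions, imputations


# Simulated complete cases (assumed values, for illustration only).
rng = random.Random(2005)
xs = [rng.uniform(0, 10) for _ in range(40)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
x_missing = 5.0  # predictor value of the case whose y is missing

fixed_pred, fixed_imp = impute_fixed_model(xs, ys, x_missing, 20, rng)
boot_pred, boot_imp = impute_bootstrap_refit(xs, ys, x_missing, 20, rng)

print("SD of fitted predictions, fixed model:   ", statistics.pstdev(fixed_pred))
print("SD of fitted predictions, bootstrap refit:", statistics.pstdev(boot_pred))
```

With the model fixed, every imputation starts from the same fitted 
prediction, so only residual noise varies between imputations. With 
bootstrap refitting the prediction itself moves from one imputation to 
the next, which is how the uncertainty from fitting the imputation model 
gets carried into the imputed values.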


>   "The 'transcan' function creates flexible additive imputation
>    models but provides only an approximation to true multiple
>    imputation as the imputation models are fixed before all
>    multiple imputations are drawn. This ignores variability
>    caused by having to fit the imputation models. 'aregImpute'
>    takes all aspects of uncertainty in the imputations into
>    account by using the bootstrap to approximate the process
>    of drawing predicted values from a full Bayesian predictive
>    distribution."
> so that the Rubin/Schafer method described above (see paragraph
> about dispersion of imputed values) is not fully implemented.
> Best wishes,
> Ted.

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
