[R] multiple imputation with fit.mult.impute in Hmisc

Frank E Harrell Jr fharrell at virginia.edu
Mon Jul 28 17:56:00 CEST 2003


On Mon, 28 Jul 2003 08:18:09 -0400
Jonathan Baron <baron at psych.upenn.edu> wrote:

> Thanks for the quick reply!  One more question, below.
> 
> On 07/27/03 22:20, Frank E Harrell Jr wrote:
> >On Sun, 27 Jul 2003 14:47:30 -0400
> >Jonathan Baron <baron at psych.upenn.edu> wrote:
> >
> >> I have always avoided missing data by keeping my distance from
> >> the real world.  But I have a student who is doing a study of
> >> real patients.  We're trying to test regression models using
> >> multiple imputation.  We did the following (roughly):
> >> 
> >> f <- aregImpute(~ [list of 32 variables, separated by + signs],
> >>  n.impute=20, defaultLinear=TRUE, data=t1)
> >> # I read that 20 is better than the default of 5.
> >> # defaultLinear makes sense for our data.
> >> 
> >> fmp <- fit.mult.impute(Y ~ X1 + X2 ... [for the model of interest],
> >>  xtrans=f, fitter=lm, data=t1)
> >> 
> >> and all goes well (usually) except that we get the following
> >> message at the end of the last step:
> >> 
> >>  Warning message: Not using a Design fitting function;
> >>  summary(fit) will use standard errors, t, P from last imputation
> >>  only.  Use Varcov(fit) to get the correct covariance matrix,
> >>  sqrt(diag(Varcov(fit))) to get s.e.
> >> 
> >> I did try using sqrt(diag(Varcov(fmp))), as it suggested, and it
> >> didn't seem to change anything from when I did summary(fmp).
> >> 
> >> But this Warning message sounds scary.  It sounds like the whole
> >> process of multiple imputation is being ignored, if only the last
> >> one is being used.
> >
> >The warning message may be ignored.  But the advice to use Varcov(fmp)
> >is faulty for lm fits - I will fix that in the next release of Hmisc.
> >You may get the imputation-corrected covariance matrix for now using
> >fmp$var.
> 
> Then it seems to me that summary(fmp) is also giving incorrect
> std. errors, t, and p.  Right?  It seems to use Varcov(fmp) and not
> fmp$var.

summary() is using the usual lm output for the last fit, so it is not
adjusted for multiple imputation.  Varcov(fmp) returns the same
unadjusted matrix that summary() uses, because I forgot to tell
Varcov.lm to look for fmp$var first.
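
Until that is fixed, the adjusted standard errors, t and P can be
recomputed by hand from the coefficients and a covariance matrix.  A
minimal sketch follows.  coef_table is a hypothetical helper, not part
of Hmisc, and a plain lm fit on a built-in data set stands in for fmp so
that the code runs on its own.  Using the residual degrees of freedom is
an assumption; it is only an approximation after multiple imputation.

```r
# Hypothetical helper (not part of Hmisc): build a coefficient table
# from a coefficient vector, a covariance matrix and degrees of freedom.
coef_table <- function(beta, V, df) {
  se   <- sqrt(diag(V))             # standard errors from the diagonal
  tval <- beta / se                 # t statistics
  pval <- 2 * pt(-abs(tval), df)    # two-sided P values
  cbind(Estimate = beta, Std.Error = se, t = tval, P = pval)
}

# Stand-in fit so the sketch runs without Hmisc.
fit <- lm(mpg ~ wt + hp, data = mtcars)
tab <- coef_table(coef(fit), vcov(fit), df.residual(fit))
print(round(tab, 4))

# With the fit from this thread one would instead call (assumption):
# coef_table(coef(fmp), fmp$var, df.residual(fmp))
```

For the stand-in fit the table simply reproduces summary(fit), since
vcov(fit) is the ordinary covariance matrix.  Passing fmp$var in its
place is what brings in the imputation correction.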

Frank

> 
> >> So I discovered I could get rid of this warning by loading the
> >> Design library and then using ols instead of lm as the fitter in
> >> fit.mult.impute.  It seems that ols provides a variance/covariance
> >> matrix (or something) that fit.mult.impute can use.
> >
> >That works too.
> 
> That gives me what I get if I use lm and then recalculate the t
> values "by hand" from fmp$var.  Thus, ols seems like the way to
> go for now, if only to avoid additional calculations.
> 
> Jon
> 

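For reference, an imputation-corrected covariance matrix is usually
built with Rubin's rules: the average within-imputation covariance plus
(1 + 1/m) times the between-imputation covariance of the coefficients.
The sketch below shows that calculation in base R.  It assumes, without
checking the Hmisc source, that fmp$var is this kind of quantity.
pool_rubin is a hypothetical helper, the data are simulated, and the
crude regression imputation only stands in for aregImpute.

```r
# Hypothetical helper: pool a list of fits with Rubin's rules.
pool_rubin <- function(fits) {
  m     <- length(fits)
  betas <- sapply(fits, coef)                   # p x m matrix of coefficients
  W     <- Reduce(`+`, lapply(fits, vcov)) / m  # within-imputation covariance
  B     <- cov(t(betas))                        # between-imputation covariance
  list(coef = rowMeans(betas), var = W + (1 + 1 / m) * B,
       within = W, between = B)
}

# Simulated data with 25% of x missing (illustration only).
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
x_obs <- x
x_obs[sample(n, 50)] <- NA
miss <- is.na(x_obs)

# Crude stochastic regression imputation of x from y.  It is far simpler
# than aregImpute and ignores uncertainty in the imputation model.
imp_model <- lm(x_obs ~ y)
m <- 20
fits <- lapply(seq_len(m), function(i) {
  x_i <- x_obs
  x_i[miss] <- predict(imp_model, data.frame(y = y[miss])) +
    rnorm(sum(miss), sd = summary(imp_model)$sigma)
  lm(y ~ x_i)
})

pooled <- pool_rubin(fits)
# Compare pooled standard errors with those of the last fit alone.
print(rbind(pooled   = sqrt(diag(pooled$var)),
            last_fit = sqrt(diag(vcov(fits[[m]])))))
```

The pooled variances can never be smaller than the average
within-imputation variances, which is why standard errors taken from a
single completed data set tend to be too optimistic.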

---
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
