[R] Multiple imputation using mice

Florian Weiler fweiler08 at jhubc.it
Wed Mar 7 12:42:22 CET 2012

Dear all,

I am trying to impute data for a range of variables in my data set.
Unfortunately, most of the variables have missing values, and some have quite
a few. I set up the predictor matrix to exclude certain variables (setting the
relevant elements to zero) and then ran the imputation. This works fine if
I use predictive mean matching ("pmm") for the continuous variables in the
data set. When I switch to "norm" instead of "pmm", the results generally look
fine as well. However, for one variable I get some values far outside the
observed range. Here are summary statistics before and after imputation:

> summary(aux$emitters)  #original data
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
  0.00219   2.10200   7.33800  17.87000  23.15000 136.20000  52.00000 

> summary(complete(imp2)$emitters)  #imputation 1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-68.920   2.062  10.000  19.980  32.980 136.200 

> summary(complete(imp2,2)$emitters) #imputation 2 (looks better)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-30.650   1.848   8.808  20.480  32.980 136.200 
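
For readers who want to reproduce this kind of setup, here is a minimal sketch of the workflow described above. It is not the original code: it assumes the nhanes example data shipped with mice as a stand-in for `aux`, and the choice of `hyp` as the excluded predictor and `chl` as the "norm" variable is purely illustrative.

```r
library(mice)

# Stand-in data (assumption): nhanes ships with mice; it is NOT the poster's `aux`.
data(nhanes)

# Dry run (maxit = 0) to get the default method vector and predictor matrix.
ini  <- mice(nhanes, maxit = 0, printFlag = FALSE)
pred <- ini$predictorMatrix
meth <- ini$method

# Rows are the variables being imputed, columns are the predictors.
# Zeroing a column drops that variable as a predictor for all targets.
pred[, "hyp"] <- 0

# Bayesian linear regression ("norm") instead of the default "pmm" for chl.
meth["chl"] <- "norm"

imp_norm <- mice(nhanes, method = meth, predictorMatrix = pred,
                 m = 5, seed = 1, printFlag = FALSE)

# Compare the observed range with the range of the imputed values.
# "norm" draws from a normal model, so imputations may fall outside
# the observed range.
range(nhanes$chl, na.rm = TRUE)
range(unlist(imp_norm$imp$chl))
```

Whether the imputed range actually exceeds the observed one depends on the data and the seed, so the two `range()` calls are meant as a check rather than a guaranteed demonstration.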


Now my question is: in such cases, would it be better to use pmm for this
variable, or should I use the squeeze() function in mice instead? I
read a paper explaining mice:
http://www.stefvanbuuren.nl/publications/MICE%20in%20R%20-%20Draft.pdf, but
I am still unsure how to proceed. I would be really grateful for some
advice, thanks!
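
To make the two options concrete, here is a hedged sketch of both, again on the nhanes stand-in data rather than `aux` (an assumption), with `chl` playing the role of `emitters`. Option A switches the variable to "pmm"; option B keeps "norm" and clamps the imputations through the `post` argument with squeeze(). Using the observed range as the bounds is an illustrative choice only. For `emitters`, a substantive bound such as a lower limit of 0 might be more appropriate.

```r
library(mice)

# Stand-in data (assumption): nhanes ships with mice; it is NOT the poster's `aux`.
data(nhanes)

ini  <- mice(nhanes, maxit = 0, printFlag = FALSE)
meth <- ini$method
post <- ini$post

# Bounds for squeezing (illustrative): the observed range of the variable.
obs_range <- range(nhanes$chl, na.rm = TRUE)

# Option A: predictive mean matching. Imputed values are borrowed from
# observed donors, so they cannot leave the observed range.
meth_a <- meth
meth_a["chl"] <- "pmm"
imp_a <- mice(nhanes, method = meth_a, m = 5, seed = 1, printFlag = FALSE)

# Option B: keep "norm", but clamp each set of imputations to the bounds.
# The post string is evaluated inside mice after the variable is imputed.
meth_b <- meth
meth_b["chl"] <- "norm"
post["chl"] <- sprintf("imp[[j]][, i] <- squeeze(imp[[j]][, i], c(%s, %s))",
                       obs_range[1], obs_range[2])
imp_b <- mice(nhanes, method = meth_b, post = post,
              m = 5, seed = 1, printFlag = FALSE)

# Both sets of imputations should now stay within the observed range.
range(unlist(imp_a$imp$chl))
range(unlist(imp_b$imp$chl))
```

One trade-off to keep in mind: squeezing piles out-of-range draws up at the bounds, which can distort the distribution near the limits, whereas pmm only ever reuses observed values.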
