[R] Normalization and missing values

Thu Apr 14 13:41:58 CEST 2005

On 04/13/05 21:05, Chris Bergstresser wrote:
     This article is great; thanks for providing it.  The authors
 recommend either using "ML Estimation" or "Multiple Imputation" to fill
 in the missing data.  They don't talk much about which is better for
 certain situations, however.

Multiple imputation is the better choice when you want to make
statistical inferences.  That is what aregImpute() (in the Hmisc
package) is for.
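
For the inference case, a minimal sketch of how aregImpute() might be
used, assuming the Hmisc package is installed.  The data frame d and
the variables y, x1, and x2 are simulated here purely for illustration
and are not from the admissions data:

```r
library(Hmisc)  # provides aregImpute() and fit.mult.impute()

set.seed(1)
n <- 200
# Simulated data (hypothetical): y depends on x1 and x2
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n)
d$x2[sample(n, 30)] <- NA        # make 30 values of x2 missing

# Draw 5 sets of imputed values for the missing x2
a <- aregImpute(~ y + x1 + x2, data = d, n.impute = 5, pr = FALSE)

# Fit the model on each completed data set and combine the results,
# so that the standard errors reflect the imputation uncertainty
f <- fit.mult.impute(y ~ x1 + x2, lm, a, data = d, pr = FALSE)
coef(f)
```

The point of the several imputations is that the final standard errors
account for not knowing the missing values, which a single fill-in
does not do.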

I used transcan() for a situation that did not involve inference:
Our graduate admissions committee of 5 rates applicants, and the
members of the committee differ somewhat in mean and variance,
and sometimes a member is out of the room when an applicant is
rated.  So I attempt to mimic what the member will do anyway,
which is to conform and adjust:

library(Hmisc)                    # provides transcan()
s.m <- as.matrix(students[,4:8])  # ratings, NA when missing
s.imp <- transcan(s.m,asis="*",data=s.m,imputed=TRUE,long=TRUE,pl=FALSE)
s.na <- is.na(s.m)                # which ratings are imputed
s.m[which(s.na)] <- unlist(s.imp$imputed)  # fill the NAs, column by column
students[,4:8] <- s.m             # write the completed ratings back

The last 3 lines seem like a kludge to me, but I couldn't find
any other way in the time I had, and this works.  This does not
involve multiple imputation.  I guess it would also be OK for
inference if there weren't very many missing data, but don't take 
my word for it.
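
Why the fill-in step works can be seen with a toy example in base R.
This is only a sketch: the matrix m and the list imp are made up, and
it assumes, as in my case, that the imputed values come as one vector
per column, in row order within each column:

```r
# Toy ratings: 3 applicants (rows) by 2 raters (columns), NA = missing
m <- matrix(c(4, NA, 5,
              NA, 3, NA), nrow = 3,
            dimnames = list(NULL, c("r1", "r2")))

# Hypothetical imputed values, one vector per column
imp <- list(r1 = 4.5, r2 = c(3.2, 3.8))

na <- is.na(m)
# which() on a logical matrix gives positions in column-major order,
# and unlist() concatenates the per-column vectors in the same order,
# so each value lands in the cell it was imputed for
m[which(na)] <- unlist(imp)
m
```

If the imputed values were ordered by row instead, this assignment
would silently put them in the wrong cells, so it is worth checking
the order on a small case first.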

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/