[R] how to replace NA with a specific score that is dependent on another indicator variable

Hadley Wickham hadley at rice.edu
Wed Sep 1 17:58:56 CEST 2010


> first ddply result did I see that some sort of misregistration had occurred;
> Better with:
>
> res <- ddply(egraw2, .(category), .fun = function(df) {
>   sapply(df, function(x) {
>     mnx <- mean(x, na.rm = TRUE)
>     sapply(x, function(z) if (is.na(z)) mnx else z)
>   })
> })

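It's a little simpler with plyr's built-in numcolwise function, which
turns a function of a vector into one that is applied to every numeric
column of a data frame: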

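library(plyr)  # provides ddply() and numcolwise()

# Example data: four categories, three numeric columns with some NAs
egraw2 <- data.frame(category = rep(1:4, 4),
  var1 = sample(c(1:3, NA, NA), 16, replace = TRUE),
  var2 = sample(c(5:10, NA, NA), 16, replace = TRUE),
  var3 = sample(c(15:20, NA, NA), 16, replace = TRUE))

# Replace each NA in x with the mean of the non-missing values of x
na.mean <- function(x) ifelse(!is.na(x), x, mean(x, na.rm = TRUE))
# Apply na.mean to every numeric column, separately within each category
ddply(egraw2, "category", numcolwise(na.mean))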
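
For comparison, here is a rough base-R sketch of the same per-group fill that does not need plyr. It is not part of the original suggestion: the data frame `toy` and the names `filled` and `value_cols` are made up for illustration, and it assumes every column other than `category` is numeric. It relies on ave(), which applies a function within each group and returns the values in the original row order.

```r
# Small deterministic stand-in for egraw2 (hypothetical data)
toy <- data.frame(category = c(1, 1, 1, 2, 2, 2),
                  var1     = c(1, NA, 3, 10, 20, NA))

# Same helper as above: replace NA with the mean of the non-missing values
na.mean <- function(x) ifelse(!is.na(x), x, mean(x, na.rm = TRUE))

# Columns to fill: everything except the grouping variable (assumed numeric)
value_cols <- setdiff(names(toy), "category")

# ave() runs na.mean on each category's values and keeps the row order
filled <- toy
filled[value_cols] <- lapply(toy[value_cols], function(col)
  ave(col, toy$category, FUN = na.mean))

filled
```

Unlike the ddply call, this keeps the rows in their original order rather than sorting them by category. If a category has no non-missing values at all, mean(x, na.rm = TRUE) is NaN, so those entries stay missing in either version.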

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


