[R] Exchange NAs for mean

Paul Hiemstra p.hiemstra at geo.uu.nl
Thu Dec 17 15:57:49 CET 2009


Barry Rowlingson wrote:
> 2009/12/17 Joel Fürstenberg-Hägg <joel_furstenberg_hagg at hotmail.com>:
>   
>> Hi all,
>>
>>
>>
>> I have a matrix (X) with observations as rows and parameters as columns. I'm trying to replace all missing values in a column with the column mean using the code below, but so far nothing happens to the NAs. Can anyone see where the problem is?
>>
>>
>>
>> N<-nrow(X) # Calculate number of rows = 108
>> p<-ncol(X) # Calculate number of columns = 88
>>
>>
>> # Replace by columnwise mean
>> for (i in colnames(X)) # Do for all columns in the matrix
>> {
>>   for (j in rownames(X)) # Go through all rows
>>   {
>>      if(is.na(X[j,i])) # Search for missing value in the given position
>>      {
>>         X[j,i]=mean(X[1:p, i]) # Change missing value to the mean of the column
>>      }
>>   }
>> }
>>
>>     
>
>
>  mean(anything with an NA in it) == NA. You want mean(X[1:p,i],na.rm=TRUE)
>
>  > mean(c(1,2,3,NA,4))
>  [1] NA
>  > mean(c(1,2,3,NA,4),na.rm=TRUE)
>  [1] 2.5
>
> I'll leave it to someone else to show you how to speed this code up by
> removing the loops...
>
> Barry
>
Hi,

To replace all the NAs in each column with that column's mean, do:

# Make example set
X = matrix(runif(25), 5, 5)
# Add some NA's
X[X>0.6] = NA

# Use apply(), which is shorthand for a loop;
# MARGIN = 2 makes it loop over the columns
X2 = apply(X,2,function(column) {
        column[is.na(column)]  = mean(column, na.rm = TRUE)
        return(column)
    })

X
X2
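
For comparison, here is a loop-free variant of the same idea. It is only a sketch and assumes a numeric matrix; the helper name impute_col_means and the example matrix Y are made up for illustration and are not part of base R or of the code above. It computes all column means once with colMeans(), then fills every NA from the mean of its own column.

```r
# Sketch only: impute_col_means() is a hypothetical helper, not a base R function.
impute_col_means <- function(M) {
  # Column means, ignoring missing values
  col_means <- colMeans(M, na.rm = TRUE)
  # Row/column positions of every NA, as a two-column matrix ("row", "col")
  na_pos <- which(is.na(M), arr.ind = TRUE)
  # Fill each NA with the mean of the column it sits in
  M[na_pos] <- col_means[na_pos[, "col"]]
  M
}

# Small fixed example (filled column by column) so the result can be checked by hand
Y <- matrix(c(1, NA, 3,
              4, 5, NA), nrow = 3)
Y2 <- impute_col_means(Y)
Y2
```

A column that consists entirely of NAs has no mean, so colMeans() returns NaN for it and its cells would be filled with NaN rather than a number. Separately, if the loop from the quoted message is kept, note that it indexes rows with 1:p, where p is the number of columns; for a 108 x 88 matrix that averages only the first 88 rows, so X[, i] or 1:N is probably what was meant.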

Is this OK, Barry? :)

cheers,
Paul

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul



