[R] NA matrix operation

Joshua Wiley jwiley.psych at gmail.com
Thu Dec 16 00:49:22 CET 2010


Hi Patrik,

If speed is an issue, an easy way to go a bit faster is to take Dr.
Spector's code and swap the apply() call for colMeans():

wh =  which(is.na(mat), arr.ind=TRUE)
mat[wh] = colMeans(mat, na.rm=TRUE)[wh[,2]]

Using that method, I can replace the NA values with column means in a
30000 x 10000 matrix with ~1% of values missing in just under 4
seconds.  With any larger matrix, my system starts complaining about
memory.
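
For anyone who wants to check the idea on a small case first, here is a
minimal sketch. It assumes a plain base-R numeric matrix; the helper name
fill_na_colmeans and the toy data are made up for illustration and are
not from the thread. which(..., arr.ind=TRUE) returns one (row, col) pair
per NA, and indexing the vector of column means by the col part gives one
replacement value per NA, in matching order.

```r
# Hypothetical helper (name invented for this example): wraps the two
# lines above so they can be reused on any base numeric matrix.
fill_na_colmeans <- function(mat) {
  # Two-column matrix of (row, col) positions, one row per NA
  wh <- which(is.na(mat), arr.ind = TRUE)
  # Column means ignoring NAs; wh[, 2] picks the mean of each NA's column
  mat[wh] <- colMeans(mat, na.rm = TRUE)[wh[, 2]]
  mat
}

# Toy data, filled column by column:
# column 1 is (1, NA, 3), column 2 is (4, 5, NA)
mat <- matrix(c(1, NA, 3,
                4, 5, NA), nrow = 3)

filled <- fill_na_colmeans(mat)
print(filled)
```

Note that a column consisting entirely of NA has a mean of NaN under
na.rm=TRUE, so its entries would come back as NaN rather than NA.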

Cheers,

Josh

On Wed, Dec 15, 2010 at 2:17 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
> Patrik -
>    I don't know if it's the fastest, but, assuming your Matrix is called
> mat, this seems to work fairly quickly:
>
> wh =  which(is.na(mat),arr.ind=TRUE)
> mat[wh] = apply(mat,2,mean,na.rm=TRUE)[wh[,2]]
>
>                                        - Phil Spector
>                                         Statistical Computing Facility
>                                         Department of Statistics
>                                         UC Berkeley
>                                         spector at stat.berkeley.edu
>
>
> On Wed, 15 Dec 2010, patrik.waldmann at djingis.se wrote:
>
>>
>>
>>        Dear All,
>>
>>        Does anyone know the fastest way to replace NA values in a
>> Matrix with their column means?
>>
>>        library(Matrix)
>> mat
>>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


