[R] cor() alternative for huge data set

Jyotasana Gulati jgulati at ice.mpg.de
Wed Sep 29 22:27:55 CEST 2010


Hi, 

I have a data set of around 43000 probes (rows) and have to calculate its correlation matrix. When I run the cor() function in R, it throws an out-of-memory error, which is to be expected for such a huge number of rows. I cannot see a logical way to cut down this huge number of entities. Is there an alternative to Pearson correlation, or a dist() method (e.g. Euclidean), that can be run on such a huge data set?
Any help will be appreciated.
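
To make the memory problem concrete, here is a rough sketch. It assumes the probes are the rows of a numeric matrix x, and that cor(t(x)) is the call in question, since cor() correlates columns. The names block_cor, block_size and handle are made up for illustration and are not part of any package. The idea is to standardise each row once and then compute the correlations one block of rows at a time, so the full 43000 x 43000 result never has to sit in RAM.

```r
# A dense n x n matrix of doubles needs n^2 * 8 bytes.
n <- 43000
n^2 * 8 / 2^30   # about 13.8 GiB for the full correlation matrix

# Hypothetical helper: Pearson correlation between rows, one block at a time.
# 'handle' is a user-supplied callback that receives the row indices of the
# block and the corresponding slice of the correlation matrix.
block_cor <- function(x, block_size = 1000,
                      handle = function(rows, block) NULL) {
  # Centre each row and scale it to unit length, so that the dot product
  # of two rows equals their Pearson correlation.
  # (Constant rows have zero variance and will give NaN.)
  z <- x - rowMeans(x)
  z <- z / sqrt(rowSums(z^2))
  for (s in seq(1, nrow(z), by = block_size)) {
    rows <- s:min(s + block_size - 1, nrow(z))
    # length(rows) x nrow(x) slice of the correlation matrix
    block <- tcrossprod(z[rows, , drop = FALSE], z)
    handle(rows, block)  # e.g. write to disk, or keep only strong pairs
  }
  invisible(NULL)
}
```

With block_size = 1000 each slice is 1000 x 43000 doubles, roughly 0.3 GiB, and the callback decides what to keep, for example only the pairs above some correlation threshold.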

Regards
..
JG


