[Rd] data frame subset patch, take 2

Marcus G. Daniels mgd at santafe.edu
Tue Dec 12 18:32:14 CET 2006


Hi Martin,

Conventions for optimizing away long, useless row name vectors sound very
useful. Nice timings too!
I've noticed that problem before and not been quite sure what to do; e.g., the
hdf5 module just gives up past a certain threshold, as the long vectors
cause performance problems and HDF5 doesn't allow giant attributes
anyway. The common case for me is no row names except numbers.
> Note however that some of these changes are backward
> incompatible. I do hope that the changes gaining efficiency
> for such large data frames are worth some adaption of
> current/old R source code..
>   
On numerous occasions I've used 64-bit Altix systems, e.g. ones with a
terabyte of RAM, for loading and preprocessing data, just so I can zip
around in the image once it is done (either on that system or
another). R works great for big datasets, even though it has a few of
these rough edges.

Marcus


