R-beta: read.table and large datasets

Thomas Lumley thomas at biostat.washington.edu
Mon Mar 9 20:11:25 CET 1998


On Mon, 9 Mar 1998, Rick White wrote:

> I find that read.table cannot handle large datasets. Suppose data is a
> 40000 x 6 dataset
> 
> R -v 100
> 
> x <- read.table("data")   gives
> Error: memory exhausted
> but
> x <- as.data.frame(matrix(scan("data"), byrow = TRUE, ncol = 6))
> works fine.

You need to increase the number of cons cells as well as the vector heap
size: roughly speaking, read.table builds many small intermediate objects
while parsing, which uses up cons cells, whereas scan fills a single
vector. For example,

R -v 40 -n 1000000

allocates 1000000 cons cells instead of the default 200000.
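
A minimal sketch of the full workflow, assuming the 40000 x 6 file is
called "data" as in your example:

R -v 40 -n 1000000
> x <- read.table("data")
> dim(x)    # should now succeed and report 40000 rows, 6 columns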

To see what sort of memory you are running out of, use gcinfo(TRUE), which
tells R to report the memory status after each garbage collection.
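
For example (a sketch; the exact layout of the report differs between R
versions):

> gcinfo(TRUE)              # print memory use at every garbage collection
> x <- read.table("data")   # collections during the read show which
> gc()                      # resource (cons cells or heap) is filling up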


Thomas Lumley
------------------------
Biostatistics		
Uni of Washington	
Box 357232		
Seattle WA 98195-7232	
------------------------




