memory once again

Dimitri Joe dimitrijoe at gmail.com
Fri Mar 3 20:28:15 CET 2006


Dear all,

A few weeks ago, I asked this list why small Stata files became huge R 
files. Thomas Lumley said it was because "Stata uses single-precision 
floating point by default and can use 1-byte and 2-byte integers. R uses 
double precision floating point and four-byte integers." And it seemed I 
couldn't do anything about it.
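
If I understand that right, the difference should show up even in a toy 
check like the one below (my own quick experiment; the numbers are 
approximate):

    object.size(integer(1e6))   # roughly 4,000,000 bytes
    object.size(numeric(1e6))   # roughly 8,000,000 bytes
    # so a column held as 1- or 2-byte integers in Stata can easily
    # grow four- to eight-fold once R stores it as doubles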

Is that really so? I mean, isn't there a (more or less simple) way to 
change how R stores data (maybe by changing the source code and 
recompiling)?

The reason I insist on this point is that I am trying to work with a 
data frame with more than 820,000 observations and 80 variables. The 
Stata file is about 150 MB. With my Pentium IV 2GHz and 1 GB of RAM, 
running Windows XP, I couldn't import it using the read.dta() function 
from the foreign package. With Stat Transfer I managed to convert the 
Stata file to a 350 MB S file, but my machine still couldn't import 
that with read.S().
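
One route I am thinking of trying instead (just a sketch of the idea; I 
haven't tested it, and "big.csv" and the column-type guesses are made 
up) is to export the data from Stata to a comma-separated text file and 
tell read.table() the column classes and row count up front, so it does 
not have to guess types and re-allocate while reading:

    # hypothetical file exported from Stata as comma-separated text
    cols <- c(rep("integer", 30), rep("numeric", 50))  # my guess at the 80 types
    dat  <- read.table("big.csv", header = TRUE, sep = ",",
                       colClasses = cols, nrows = 820000,
                       comment.char = "")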

I even tried to "increase" my memory by calling memory.limit(4000), but 
that didn't help either.
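
For what it is worth, this is what I ran (Windows-only functions, as far 
as I know):

    memory.limit()             # current limit, in MB
    memory.limit(size = 4000)  # ask Windows for up to 4000 MB
    memory.size(max = TRUE)    # the most memory R has used so far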

Regardless of the answer to my question, I'd appreciate hearing about 
your experience and suggestions for working with big files in R.

Thank you for youR-Help,

Dimitri Szerman


