[R] memory once again

Berton Gunter gunter.berton at gene.com
Fri Mar 3 20:42:07 CET 2006


What you propose is not really a solution, as even if your data set didn't
break the modified precision, another would. And of course, there is a price
to be paid for reduced numerical precision.
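To see the storage trade-off being discussed, a quick sketch in R (illustrative only; the vector and its size are arbitrary) comparing how much memory the same values take as 4-byte integers versus 8-byte doubles:

```r
# Same million values stored two ways:
n <- 1e6
x_int <- 1:n              # integer vector: 4 bytes per element
x_dbl <- as.numeric(x_int) # double vector: 8 bytes per element
object.size(x_int)        # roughly 4 MB plus a small header
object.size(x_dbl)        # roughly 8 MB, about twice as large
```

The double vector is about twice the size, which is why a Stata file stored in single precision roughly doubles when R reads it in.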

The real issue is that R's current design is incapable of dealing with data
sets larger than what can fit in physical memory (expert
comment/correction?). My understanding is that there is no way to change
this without a fundamental redesign of R. This means that you must either
live with R's limitations or use other software for "large" data sets.
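One common way to "live with" the limitation, when the analysis can be done on summaries, is to read the file in chunks from an open connection so that only one block of rows is in memory at a time. A sketch (the file name, chunk size, and column classes here are hypothetical, not the poster's actual data):

```r
# Process a large delimited text file in fixed-size chunks.
con <- file("bigfile.txt", open = "r")      # hypothetical file name
header <- readLines(con, n = 1)             # consume the header line
repeat {
  # Successive read.table() calls on an open connection continue
  # where the previous one stopped; an error signals end of input.
  chunk <- tryCatch(read.table(con, nrows = 10000),
                    error = function(e) NULL)
  if (is.null(chunk)) break
  ## ...summarise or aggregate the chunk here; keep only the summary...
  if (nrow(chunk) < 10000) break            # last, partial chunk
}
close(con)
```

Supplying colClasses to read.table() also helps, since it avoids the type-guessing pass and lets columns be stored as integers where possible.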

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dimitri Joe
> Sent: Friday, March 03, 2006 11:28 AM
> To: R-Help
> Subject: [R] memory once again
> 
> Dear all,
> 
> A few weeks ago, I asked this list why small Stata files became huge R
> files. Thomas Lumley said it was because "Stata uses single-precision
> floating point by default and can use 1-byte and 2-byte integers. R uses
> double precision floating point and four-byte integers." And it seemed I
> couldn't do anything about it.
> 
> Is it true? I mean, isn't there a (more or less simple) way to change
> how R stores data (maybe by changing the source code and compiling it)?
> 
> The reason I insist on this point is that I am trying to work with a
> data frame with more than 820,000 observations and 80 variables. The
> Stata file is 150 MB. With my Pentium IV 2 GHz and 1 GB RAM, Windows
> XP, I couldn't do the import using the read.dta() function from package
> foreign. With Stat Transfer I managed to convert the Stata file to an S
> file of 350 MB, but my machine still didn't manage to import it using
> read.S().
> 
> I even tried to "increase" my memory by memory.limit(4000), but it
> still didn't work.
> 
> Regardless of the answer to my question, I'd appreciate hearing about
> your experience/suggestions for working with big files in R.
> 
> Thank you for youR-Help,
> 
> Dimitri Szerman
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 


