[R] One critical question in R

Nordlund, Dan (DSHS/RDA) NordlDJ at dshs.wa.gov
Tue Aug 4 17:42:08 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Hyo Karen Lee
> Sent: Tuesday, August 04, 2009 8:21 AM
> To: r-help at r-project.org
> Subject: [R] One critical question in R
> 
> Hi,
> I have one critical question in using R.
> I am currently working on some research which involves huge amounts
> of data(it is about 15GB).
> I am trying to use R in this research rather than using SAS or STATA.
> (The company where I am working right now, is trying to switch SAS/STATA to
> R)
> 
> As far as I know, the memory limit in R is 4GB;

The memory limit depends on your hardware and OS, which you haven't told us about.  With Linux on a 64-bit computer the limit is MUCH higher.  With a 32-bit MS Windows OS you likely won't get even 3GB.

> However, I believe that there are ways to handle the large dataset.

You can use a database program like MySQL, for example.  If you have files that are on the order of 15GB in size, I don't think you are going to have much success cleaning the data using R (well, I know I wouldn't, but maybe one of the experts here can help you out).  You may be able to use the biglm package for analyses, or read in just the data you need for your regressions.  If you need more help, you will need to tell us more about what your data are like, with more specifics about what your analyses will look like.
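
To illustrate the "read in just the data you need" idea, here is a rough base-R sketch, not a tested recipe for your data.  It assumes a comma-separated file with a header row, numeric variables, and no commas inside quoted fields.  The function name fit_ols_chunked and the column names in the usage note are made up for illustration.  It reads the file a block of rows at a time, keeps only the columns the regression uses, and accumulates X'X and X'y so the full file never has to sit in memory.  The biglm package does this kind of incremental fitting for you, and more carefully than plain cross products.

```r
# Sketch only: OLS on a CSV too large for memory, read in chunks.
# Assumes numeric columns and a simple comma-separated layout.
fit_ols_chunked <- function(path, response, predictors, chunk_rows = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header line once; strip any quotes around the column names.
  header <- strsplit(readLines(con, n = 1L), ",", fixed = TRUE)[[1]]
  header <- gsub('"', "", header, fixed = TRUE)
  keep <- c(response, predictors)
  stopifnot(all(keep %in% header))

  p <- length(predictors) + 1L        # +1 for the intercept
  xtx <- matrix(0, p, p)              # running sum of X'X
  xty <- numeric(p)                   # running sum of X'y

  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0L) break    # end of file
    chunk <- read.csv(text = lines, header = FALSE, col.names = header)
    # Keep only the needed columns and drop rows with missing values.
    chunk <- chunk[complete.cases(chunk[keep]), keep, drop = FALSE]
    if (nrow(chunk) == 0L) next
    X <- cbind(1, as.matrix(chunk[predictors]))
    y <- chunk[[response]]
    xtx <- xtx + crossprod(X)
    xty <- xty + drop(crossprod(X, y))
  }

  # Solve the normal equations for the coefficients.
  beta <- solve(xtx, xty)
  names(beta) <- c("(Intercept)", predictors)
  beta
}
```

A call would look something like fit_ols_chunked("mydata.csv", "y", c("x1", "x2")), where the file and column names are placeholders.  This returns coefficients only; for standard errors, or for logit models, you would want biglm (or a database plus a sample of the data) rather than this sketch.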

> Most of my works in R would be something like cleaning the data or running a
> simple regression(OLS/Logit) though.
> 
> The whole company relies on me when it comes to R.
> Please teach me how to deal with large data in R.
> If you can, please give me a response very soon.
> Thank you very much.
> 
> Regards,
> Hyo
> 

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 



