[R] problems reading a large dta dataset in R

Daniel Nordlund djnordlund at frontier.com
Tue Mar 6 03:25:50 CET 2012

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Francesco Sarracino
> Sent: Monday, March 05, 2012 4:00 PM
> To: r-help at r-project.org
> Subject: [R] problems reading a large dta dataset in R
> Dear R listers,
> I have a silly problem. I am trying to load a dta (Stata) file in R.
> The dta is about 650 MB and contains the integrated World Values
> Survey/ European Value Study data-set.
> My problem is that I can't load the file. Almost 3 hours after
> I issued the following command:
> data <-
> stata.get("/alter/FS/INT_WVS2008EVS/V02/data/integrated_values_surveys_1981_2008.dta")
> I still don't have my data loaded.
> These are the libraries that I load automatically:
> library("foreign")
> library("Hmisc")
> library("MASS")
> library("lattice")
> library("arm")
> library("memisc")
> Moreover, my system becomes very slow and not responsive.
> I can't figure out what is going on. I am sorry, but I can't provide
> you with a working example.
> I also tried converting the dta into csv or txt files. Unfortunately,
> the size of the file increases by 4 times (~2.5Gb) and no improvement
> on the side of R.
> Here are my specs:
> Ubuntu Linux 11.10 x86_64-pc-linux-gnu (64-bit)
> Intel Core i7, 4 GB RAM, 367 GB Free HD, 8 GB swap memory
> R:
> R version 2.14.2 (2012-02-29)
> Can you please help me figure out what's wrong? I find it hard to
> believe that R can't handle files of this size.
> Thanks a lot for your kind help,
> f.

You are correct: R can handle files of that size with no problem, given sufficient memory.  But you say you can't provide a working example, and without one it will be difficult for anyone to give you a working solution.  We don't have access to the file you are trying to read, so we can't know whether the problem lies with the file, your system, or something else.
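
One thing that can be checked without the file is whether "sufficient memory" plausibly holds.  The sketch below is only a rough estimate, not a diagnosis: it assumes every value ends up as an 8-byte double in R (Stata can store small integers in 1 or 2 bytes, so the on-disk size may understate the in-memory size).  The function name estimate_gb and the row and column counts are hypothetical placeholders; substitute the real dimensions of the data set.

```r
# Rough upper-bound estimate of the RAM a data set needs once loaded in R.
# Assumes every value is stored as an 8-byte double; integer and logical
# columns take 4 bytes each, so the real figure may be lower.
estimate_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Hypothetical dimensions, NOT taken from the actual file.
n_rows <- 400000
n_cols <- 1000
needed_gb <- estimate_gb(n_rows, n_cols)
cat(sprintf("Estimated size in R: %.2f GB\n", needed_gb))

# Sanity check on a small data frame: object.size() reports the bytes
# R actually uses, which should be close to 8 per numeric value.
small <- data.frame(matrix(rnorm(1000 * 10), nrow = 1000))
bytes_per_value <- as.numeric(object.size(small)) / (1000 * 10)
cat(sprintf("Measured bytes per value: %.1f\n", bytes_per_value))
```

If the estimate comes close to or exceeds the 4 GB of physical RAM, the machine will start swapping, which would be consistent with the slow, unresponsive system described.  R also typically needs extra working memory on top of this while importing.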


Daniel Nordlund
Bothell, WA USA
