[R] problems reading a large dta dataset in R

Francesco Sarracino f.sarracino at gmail.com
Tue Mar 6 00:59:45 CET 2012


Dear R listers,

I have a silly problem. I am trying to load a dta (Stata) file in R.
The dta is about 650 MB and contains the integrated World Values
Survey / European Values Study data set.
My problem is that I cannot get the file to load. Almost 3 hours
after issuing the following command:

data <- stata.get("/alter/FS/INT_WVS2008EVS/V02/data/integrated_values_surveys_1981_2008.dta")

I still don't have my data loaded.
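To rule out a problem with the reader itself, a small round-trip test along these lines could be run first. This is only a sketch: the toy data frame and its column names are made up for illustration, and it assumes that stata.get() relies on read.dta() from the foreign package underneath. It writes a tiny dta file, reads it back, and compares the size on disk with the size in memory.

```r
library(foreign)  # recommended package, normally installed with R

set.seed(1)
# Hypothetical toy data standing in for the real survey file.
toy <- data.frame(id    = 1:1000,
                  score = rnorm(1000),
                  wave  = sample(1:5, 1000, replace = TRUE))

tmp <- tempfile(fileext = ".dta")
write.dta(toy, tmp)

# Time the plain reader and compare on-disk size with in-memory size.
elapsed <- system.time(back <- read.dta(tmp))[["elapsed"]]
disk_mb <- file.info(tmp)$size / 2^20
ram_mb  <- as.numeric(object.size(back)) / 2^20

cat(sprintf("read in %.2f s; %.3f MB on disk, %.3f MB in memory\n",
            elapsed, disk_mb, ram_mb))
```

If the toy file loads instantly, the reader works and the difficulty is more likely tied to the size of the real file relative to the available RAM.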
These are the libraries that I load automatically:
library("foreign")
library("Hmisc")
library("MASS")
library("lattice")
library("arm")
library("memisc")

Moreover, my system becomes very slow and unresponsive.
I can't figure out what is going on. I am sorry, but I can't provide
you with a working example.
I also tried converting the dta into csv or txt files. Unfortunately,
the file grows roughly fourfold (~2.5 GB), and R handles it no
better.
Here are my specs:
Ubuntu Linux 11.10 x86_64-pc-linux-gnu (64-bit)
Intel Core i7, 4 GB RAM, 367 GB Free HD, 8 GB swap memory
R:
R version 2.14.2 (2012-02-29)

Could you please help me figure out what's wrong? I find it hard to
believe that R can't handle files of this size.
Thanks a lot for your kind help,
f.


