[R] data set size question

Gabor Grothendieck ggrothendieck at gmail.com
Wed Jun 14 06:01:18 CEST 2006


The restriction is that R keeps its objects in memory,
so if you have sufficient memory and your OS lets you
access it then you should be OK.  S-Plus is a commercial
package similar to R, but it stores its objects in files and can
handle larger data sets if you run into trouble.
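
As a rough sketch of what "kept in memory" means in practice: a numeric
(double) value takes 8 bytes, so the raw footprint of a data set can be
estimated before loading it.  The sizes below are the 100,000 by 20 case
from the question; the variable names are only illustrative, and fitting a
model usually needs several times the raw size for working copies.

```r
n <- 100000   # observations
p <- 20       # variables

# Back-of-the-envelope estimate: 8 bytes per double
est_mb <- n * p * 8 / 1024^2
est_mb

# Compare with what R actually allocates for a matrix of that shape
x <- matrix(0, n, p)
print(object.size(x), units = "MB")
```

Both figures come out at roughly 15 MB, which is small next to the RAM of
a typical PC.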

Given that R is free and, once downloaded, can be
installed on Windows in a minute or so (I assume it's
just as easy on other OSes), just install it, generate
some test data and see if you have any problems.
For example, I had no trouble running the following on my PC:

n <- 100000  # observations
p <- 20      # variables
x <- matrix(rnorm(n * p), n)
colnames(x) <- letters[1:p]
# regress column a against the rest
x.lm <- lm(a ~ ., data = as.data.frame(x))
plot(x.lm)  # click mouse to advance to successive plots
summary(x.lm)
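
To see how the same test behaves as the data grows, it can be wrapped in a
small helper that reports timing and object sizes.  This is only a sketch:
fit_test is a hypothetical name, not part of R, and it assumes p is at
most 26 because the columns are named from letters.

```r
# Hypothetical helper: fit the same regression on n rows of random data
# and report how large the data and the fitted model are.
fit_test <- function(n, p = 20) {
  x <- matrix(rnorm(n * p), n)
  colnames(x) <- letters[1:p]   # assumes p <= 26
  # time only the model fit; 'fit' is assigned inside this function
  elapsed <- system.time(fit <- lm(a ~ ., data = as.data.frame(x)))[["elapsed"]]
  list(n       = n,
       data_mb = as.numeric(object.size(x)) / 1024^2,
       fit_mb  = as.numeric(object.size(fit)) / 1024^2,
       seconds = elapsed)
}

# Increase n step by step toward your real data size
for (n in c(1e4, 1e5)) str(fit_test(n))
```

Note that the fitted lm object is itself several times larger than the
data, since by default it keeps the model frame, residuals, fitted values
and the QR decomposition.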

On 6/13/06, Carl Hauser <Carl.Hauser at nwea.org> wrote:
> Hi there,
>
> I'm very new to R and am only in the beginning stages of investigating
> it for possible use. A document by John Maindonald at the r-project
> website entitled "Using R for Data Analysis and Graphics: Introduction,
> Code and Commentary" contains the following paragraph, "The R system may
> struggle to handle very large data sets. Depending on available computer
> memory, the processing of a data set containing one hundred thousand
> observations and perhaps twenty variables may press the limits of what R
> can easily handle". This document was written in 2004.
>
> My questions are:
>
> Is this still the case? If so, has anyone come up with creative
> solutions to mitigate these limitations? If you work with large data
> sets in R, what have your experiences been?
>
> From what I've seen so far, R seems to have enormous potential and
> capabilities. I routinely work with data sets of several hundred
> thousand to several million observations. It would be unfortunate if such potential
> and capabilities were not realized because of (effective) data set size
> limitations.
>
> Please tell me it ain't so.
>
> Thanks for any help or suggestions.
>
> Carl
>
>
>


