[R] memory limit

seanpor seanpor at acm.org
Wed Nov 26 14:45:29 CET 2008

Good afternoon,

The short answer is "yes", the long answer is "it depends".

It all depends on what you want to do with the data, I'm working with
dataframes of a couple of million lines, on this plain desktop machine and
for my purposes it works fine.  I read in text files, manipulate them,
convert them into dataframes, do some basic descriptive stats and tests on
them, a couple of columns at a time, all quick and simple in R.  There are
some packages which are set up to handle very large datasets, e.g. biglm [1].
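
As a minimal sketch of the chunked approach biglm [1] takes (the data here
is made-up; with a real file you'd read one chunk at a time), the model is
built on the first chunk and then updated with each subsequent one:

```r
library(biglm)

# Hypothetical chunked fit: start the model on the first chunk of data,
# then fold in further chunks without holding everything in memory at once.
chunk1 <- data.frame(y = rnorm(1000), x = rnorm(1000))
fit <- biglm(y ~ x, data = chunk1)

chunk2 <- data.frame(y = rnorm(1000), x = rnorm(1000))
fit <- update(fit, chunk2)   # incorporates the second chunk into the fit

summary(fit)   # coefficients estimated from both chunks combined
```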

If you're using algorithms which require vast quantities of memory, then as
the previous emails in this thread suggest, you might need R running on a
64-bit machine with a larger memory limit.

If you're working with a problem which is "embarrassingly parallel"[2], then
there are a variety of solutions - if you're in between, then the solutions
are much more data dependent.
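
For the embarrassingly parallel case, a sketch using the parallel package
(the chunks and the per-chunk summary here are placeholders for whatever
independent pieces your problem splits into):

```r
library(parallel)

# Hypothetical embarrassingly-parallel job: the same independent summary
# applied to each of several data chunks, one worker per chunk.
chunks <- split(rnorm(1e5), rep(1:4, length.out = 1e5))

# mclapply forks workers on Unix-alikes; on Windows use parLapply instead.
means <- mclapply(chunks, mean, mc.cores = 2)
unlist(means)
```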

The flip question: how long would it take you to get up and running with the
functionality (tried and tested in R) you require if you're going to be
re-working things in C++?

I suggest that you have a look at R, possibly using a subset of your full
dataset to start with - you'll be amazed how quickly you can get up and
running.
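
Starting from a subset can be as simple as passing nrows to the reader (a
small temporary file stands in here for your real data, whose path and
columns you'd substitute):

```r
# Hypothetical quick start: prototype the analysis on the first N rows
# before committing to loading the full file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = rnorm(1000)), tmp,
          row.names = FALSE)

# nrows limits how much is read - handy for a first look at a huge file.
sub <- read.csv(tmp, nrows = 100)
nrow(sub)            # only the first 100 rows were loaded
summary(sub$value)   # basic descriptive stats on one column
```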

As suggested at the start of this email... "it depends"...

Best Regards,
Sean O'Riordain

[1] http://cran.r-project.org/web/packages/biglm/index.html
[2] http://en.wikipedia.org/wiki/Embarrassingly_parallel

iwalters wrote:
> I'm currently working with very large datasets that consist of
> 1,000,000+ rows.  Is it at all possible to use R for datasets this size,
> or should I rather consider C++/Java?

