[R] tree model large dataset -error

Prof Brian D Ripley ripley at stats.ox.ac.uk
Mon Sep 17 10:33:41 CEST 2001

On Mon, 17 Sep 2001, srinivasa raghavan wrote:

> Hi R-users,
>      I was trying to do CART using the tree package
> for a dataset with 6 groups and 1000 predictor
> variables, number of observations 5000. The following
> error message is generated

Please, CART is a trademark (so don't misuse it), and what tree does is not
CART anyway. I suggest you use rpart (a closer approximation to CART).
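
A minimal sketch of an rpart classification fit, for orientation only. It uses the built-in iris data as a stand-in (an assumption, since the poster's data are not shown) and relies on rpart's default control settings.

```r
# rpart is a recommended package and ships with standard R installations
library(rpart)

# Classification tree on a small stand-in dataset (iris), not the poster's data
fit <- rpart(Species ~ ., data = iris, method = "class")

# Complexity-parameter table, useful when deciding how far to prune
printcp(fit)

# Predicted classes for the training data
pred <- predict(fit, iris, type = "class")
```

For the poster's problem the response would be the 6-level grouping factor and the formula would name the chosen predictors.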

I don't think this is sensible statistically.  With 1000 predictors
to choose from and only 5000 observations, you will just suffer from
data-dredging.  You must know something about the 1000 predictors,
so use structured subgroups.
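
One way to read "structured subgroups" is to fit on one named block of related predictors at a time. The sketch below assumes that reading; the data are simulated and the column prefixes `chem` and `phys` are hypothetical, chosen only to illustrate selecting a block by name.

```r
library(rpart)

set.seed(1)
n <- 500

# Simulated stand-in data: 20 predictors in two hypothetical blocks
x <- as.data.frame(matrix(rnorm(n * 20), nrow = n))
names(x) <- c(paste0("chem", 1:10), paste0("phys", 1:10))

# A 6-level response, matching the "6 groups" in the question
x$group <- factor(sample(paste0("g", 1:6), n, replace = TRUE))

# Select one block of predictors by its (hypothetical) name prefix
chem_vars <- grep("^chem", names(x), value = TRUE)

# Build the formula group ~ chem1 + ... + chem10 and fit on that block only
f <- reformulate(chem_vars, response = "group")
fit <- rpart(f, data = x, method = "class")
```

Because the simulated predictors are pure noise, the fitted tree carries no meaning here; the point is only the mechanics of restricting the formula to a subgroup.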

However, let's do some calculations.  R stores data in memory.  You have 5
million data items, and they will be stored in doubles, so your dataset is
ca 40Mb.  You will need at least a couple more copies.  So you need more
than 128Mb.
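
The arithmetic above can be checked directly. This is a rough sketch: it takes 1 Mb as 10^6 bytes and reads "a couple more copies" as two extra copies, both of which are assumptions.

```r
n_obs  <- 5000   # observations
n_pred <- 1000   # predictor variables
bytes_per_double <- 8

n_items <- n_obs * n_pred                     # 5 million data items
size_mb <- n_items * bytes_per_double / 1e6   # about 40 Mb for one copy

# Original plus two working copies (assumed), before any other overhead
total_mb <- 3 * size_mb                       # about 120 Mb

# Sanity check on a smaller double matrix: roughly 8 bytes per element
m <- matrix(0, nrow = 500, ncol = 100)
print(object.size(m), units = "Kb")
```

Three copies alone come to about 120 Mb, which is already close to the 125Mb allocation limit reported in the error message.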

> Error:cannot allocate vector of size 3910 kb
> In addition: warning message:
> Reached total allocation of 125Mb:
> I am using R 1.3 for Windows 98 in a pII with 128
>     If RAM should be increased what is the ideal
> system configuration for processing large datasets say
> more than 500 MB size to perform Multivariate and
> Exploratory data analysis (I may not be able to switch
> to mainframe or supercomputers)

(There's a lot between Windows 98 and mainframes or supercomputers.
You can let R use virtual memory: see the rw-FAQ, but Windows 98
is not good at this.  Linux runs R well on machines with 1Gb RAM:
Windows 2000 also runs R well but may use memory less efficiently.)

> any suggestion will be highly appreciated

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

