[R] gbm

Weiwei Shi helprhelp at yahoo.com
Thu Jan 13 01:11:58 CET 2005


Hi, there:
Thanks a lot for everyone's prompt replies.

In detail, I am facing a large amount of data: over
10,000 records and 400 variables. This project is very
challenging and interesting to me. I tried rpart, which
gave me some promising results, but they were not good
enough. So I am trying randomForest and gbm now.

My plan for using gbm is like this:
rt <- rpart(...)
gbm(formula(rt), ...)

Does this work? (My first question)
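
A minimal sketch of this plan, assuming a made-up toy data
frame d (the names d, x1..x5 and y are hypothetical, not the
real project data). As far as I understand, formula(rt) gives
back the formula the tree was fitted with, not only the
variables that ended up in splits, so the sketch also pulls
the split variables out of rt$frame. The gbm call only runs
if the gbm package happens to be installed.

```r
library(rpart)

# Hypothetical toy data standing in for the real data set.
set.seed(1)
d <- data.frame(matrix(rnorm(200 * 5), ncol = 5))
names(d) <- paste0("x", 1:5)
d$y <- d$x1 + 2 * (d$x3 > 0) + rnorm(200, sd = 0.1)

# Step 1: fit the tree on all predictors.
rt <- rpart(y ~ ., data = d)

# formula() on the fitted tree recovers the model formula.
f_full <- formula(rt)

# Variables that actually appear in a split ("<leaf>" marks terminal nodes).
used <- setdiff(unique(as.character(rt$frame$var)), "<leaf>")
f_used <- reformulate(used, response = "y")

# Step 2: pass a formula on to gbm (skipped if gbm is not installed).
if (requireNamespace("gbm", quietly = TRUE)) {
  fit <- gbm::gbm(f_used, data = d, distribution = "gaussian",
                  n.trees = 100, interaction.depth = 2)
}
```

Using f_used instead of f_full is one way to let the tree act
as a variable filter; whether that helps is something to check
on the real data.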

My other concern about gbm is scalability, since I
realize R seems to load all the data into memory. (My
second question)
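
A rough back-of-the-envelope estimate of the in-memory size,
assuming exactly 10,000 rows and 400 columns all stored as
doubles (8 bytes each). These are assumptions: the real data
has more rows, and model fitting typically makes extra copies,
so this is only a lower bound.

```r
n_rows <- 10000   # assumed row count (the real data has more)
n_vars <- 400     # number of variables

# Raw storage for a dense numeric matrix: 8 bytes per double.
bytes <- n_rows * n_vars * 8
mib   <- bytes / 1024^2

# Cross-check against what R itself reports for a smaller matrix.
x <- matrix(0, nrow = 1000, ncol = n_vars)
size_x <- as.numeric(object.size(x))

cat("estimated size:", round(mib, 1), "MiB\n")
cat("1000-row matrix:", size_x, "bytes\n")
```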

But I believe the idea above will run very slowly. (I
think I might try TreeNet, though I don't like it
since it is commercial.) BTW, sampling might be a good
idea in general, but my previous experiments suggest it
does not work well for my project.

I read some of the references mentioned earlier by
helpers before I sent my first email, but I would still
appreciate any help. You guys are so nice!

BTW, gbm stands for gradient boosting machine :)

Ed