[R] Recursive partitioning algorithms in R vs. alia

Wensui Liu liuwensui at gmail.com
Fri Jun 19 23:09:54 CEST 2009


Well, how difficult is it to code a random forest with a SAS macro + proc split?
If you lack SAS programming skills, then you are correct that
you will have to wait 8 years :-)
I don't know how much SAS experience you have. As far as I know, both
bagging and boosting have been implemented in SAS EM for a while,
together with other cutting-edge modeling tools such as SVM / nnet.
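
To make the "loop around an existing tree procedure" idea concrete, here is a rough sketch in Python (not SAS, and not the code of any product mentioned in this thread). It assumes a deliberately tiny base learner, a one-split decision stump, standing in for a full tree procedure, and it draws one random feature subset per tree (the random subspace variant of bagging) rather than resampling features at every split as Breiman's random forest does. All function names and the toy data are made up for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Fit a one-split tree: pick the (feature, threshold) pair with the
    fewest misclassifications among the candidate features."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put data on both sides
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No valid split exists: always predict the majority class.
        maj = Counter(labels).most_common(1)[0][0]
        return (None, None, maj, maj)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab


def fit_ensemble(rows, labels, n_trees=25, n_features=1, seed=0):
    """Bagging plus a random feature subset per tree."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(p), n_features)     # random feature subset
        ensemble.append(fit_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx],
                                  feats))
    return ensemble


def predict_ensemble(ensemble, row):
    """Majority vote over the individual stumps."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (invented): two well-separated classes in two features.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
        (6.0, 7.1), (6.5, 6.8), (7.0, 7.5), (7.2, 6.2)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

model = fit_ensemble(rows, labels)
print(predict_ensemble(model, (0.5, 0.5)))
print(predict_ensemble(model, (9.0, 9.0)))
```

The outer loop in `fit_ensemble` is the part a macro would supply; the base learner is whatever tree procedure is at hand. A real random forest additionally grows deep trees and resamples the candidate features at each split, which this sketch does not attempt.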


On Fri, Jun 19, 2009 at 4:18 PM, Tobias Verbeke
<tobias.verbeke at openanalytics.be> wrote:
> Wensui Liu wrote:
>
>> in terms of the richness of features and the ability to handle large
>> data (which is normal in banks), SAS EM should be on top of the others.
>
> Should be ? That is not at all my experience.
> SAS EM is very much lagging behind current
> research. You will find variants of random forests
> in R that will not be in SAS for the next 8 years,
> to give just one example.
>
>> however, it is not cheap.
>> in terms of algorithms, the split procedure in sas em can do
>> chaid/cart/c4.5, if i remember correctly.
>
> These are techniques of the 80s and 90s
> (which proves my point). CART is in rpart and
> an implementation of C4.5 can be accessed
> through RWeka. For the oldest one (CHAID, 1980),
> there might be an implementation soon:
>
> http://r-forge.r-project.org/projects/chaid/
>
> but again there have been quite some improvements
> in the last decade as well:
>
> http://cran.r-project.org/web/views/MachineLearning.html
>
> HTH,
> Tobias
>
>> On Fri, Jun 19, 2009 at 2:35 PM, Carlos J. Gil Bellosta
>> <cgb at datanalytics.com> wrote:
>>>
>>> Dear R-helpers,
>>>
>>> I had a conversation with a guy working in a "business intelligence"
>>> department at a major Spanish bank. They rely on recursive partitioning
>>> methods to rank customers according to certain criteria.
>>>
>>> They use both SAS EM and Salford Systems' CART. I have used the package
>>> rpart in the past, but I could not provide any kind of feature comparison
>>> or the like as I have no access to any installation of the first two
>>> proprietary products.
>>>
>>> Has anybody experience with them? Is there any public benchmark
>>> available? Is there any very good --although solely technical-- reason
>>> to pay hefty software licences? How would the algorithms implemented in
>>> rpart compare to those in SAS and/or CART?
>>>
>>> Best regards,
>>>
>>> Carlos J. Gil Bellosta
>>> http://www.datanalytics.com



-- 
==============================
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller



