[R] randomForest

Uwe Ligges ligges at statistik.tu-dortmund.de
Fri Mar 20 19:03:06 CET 2009



Anirudh Kondaveeti wrote:
> To be more clear,
> 
> My data set contains two classes, Class 1 and Class 2.
> Class 1 has original data with 300 rows.
> Class 2 is randomly generated data with 1500 rows.
> 
> I want to sample a new data set with
> Class 1 - all the rows
> Class 2 - only 300 rows out of 1500 rows
> 
> and then use it in random forest with 500 trees.
> 
> Also, Class 2 should contribute a different 300 rows to each tree in the
> forest. Thanks!


Ah, in that case (stratified sampling) you can, in principle, combine the
arguments "strata" and "sampsize". But you cannot select ALL rows of one
class: doing so would ignore one of the main ideas of random forests, which
is to bootstrap the observations, and randomForest will certainly bootstrap
for you.
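
A minimal sketch of that combination, assuming simulated data in place of
the real data set: the objects x and y, the class labels and the seed are
made up for illustration. It also assumes that the elements of "sampsize"
are matched to the levels of the "strata" factor in order. keep.inbag = TRUE
is only there so that one can inspect which rows each tree was given.

```r
library(randomForest)

set.seed(1)

# Hypothetical stand-in data: 300 "original" rows (class1) and
# 1500 randomly generated rows (class2), four numeric predictors.
n1 <- 300
n2 <- 1500
x <- rbind(matrix(rnorm(n1 * 4, mean = 1), ncol = 4),
           matrix(rnorm(n2 * 4), ncol = 4))
y <- factor(rep(c("class1", "class2"), times = c(n1, n2)))

# Stratified sampling: 300 draws per class for every tree.
# Assumption: sampsize follows the order of levels(y).
# With the default replace = TRUE the 300 draws from class1 are a
# bootstrap sample, so not every class1 row ends up in every tree.
fit <- randomForest(x = x, y = y, ntree = 500,
                    strata = y, sampsize = c(300, 300),
                    keep.inbag = TRUE)

# fit$inbag has one row per observation and one column per tree;
# comparing two columns shows that the trees get different samples.
identical(fit$inbag[, 1], fit$inbag[, 2])
print(fit)
```

The sampling is redone for each tree, so the 300 rows taken from the second
class differ from tree to tree without any further effort.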

Uwe Ligges



> Anirudh Kondaveeti
> ----------------------------
> 
> 
> On Fri, Mar 20, 2009 at 1:45 PM, Anirudh Kondaveeti <
> anirudh.kondaveeti at gmail.com> wrote:
> 
>> sampsize uses the same sample for all the trees in the random forest.
>>
>> But I want to use a different sample for each of the 500 trees in the
>> random forest. Thanks!
>>
>>
>> Anirudh Kondaveeti
>> ----------------------------
>>
>>
>> 2009/3/20 Uwe Ligges <ligges at statistik.tu-dortmund.de>
>>
>>
>>> Anirudh Kondaveeti wrote:
>>>
>>>> Hi!
>>>>
>>>> I am working with random forests in R.
>>>>
>>>> Is there a way to sample a fixed number of rows from a data set for use
>>>> with the different trees in a random forest? To be more clear, my data
>>>> set contains 1500 rows, and I am growing 500 trees. Is it possible to
>>>> sample only 500 rows from the data set for each tree, so that each tree
>>>> of the forest uses a different 500 rows?
>>>>
>>>
>>> See ?randomForest and the argument sampsize.
>>>
>>> Uwe Ligges
>>>
>>>
>>>
>>>
>>>> Thanks in advance!
>>>>
>>>> Anirudh Kondaveeti
>>>> ----------------------------
>>>>
>



