[R] Random Forest - Strata

Coll gbcoll2 at gmail.com
Wed Jul 28 18:42:37 CEST 2010


Max, 

Thanks. Yes, what you described is exactly what I am looking for, i.e. the
first tree fits using data from sites A&B, then predicts on C (and so on).

Does that mean that if I:
1. pass this list as the "index" argument of trainControl()
> tmpSiteList
[[1]]
[1] 1 2 3 4 5 6 7

[[2]]
[1]  1  2  3  8  9 10

[[3]]
[1]  4  5  6  7  8  9 10

AND

2. use other "methods" in the trainControl() 

then I would get the RF to be built and tested in the above way?
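
For concreteness, here is a minimal sketch of how a list like the one above
could be built from one site label per row. The "site" vector (rows 1-3 = A,
4-7 = B, 8-10 = C) is an assumption inferred from the indices shown, and
"holdOut" is a hypothetical name used only for illustration. The
trainControl() call is guarded so the sketch still runs without caret.

```r
# Assumed site label for each of the 10 rows (inferred from the list above)
site <- factor(c("A", "A", "A", "B", "B", "B", "B", "C", "C", "C"))

# One element per resampling iteration: the rows used for training,
# i.e. every row that is NOT from the held-out site (C, then B, then A)
tmpSiteList <- lapply(rev(levels(site)), function(s) which(site != s))

# Rows left out of each iteration (complement of the training rows)
holdOut <- lapply(tmpSiteList, function(idx) setdiff(seq_along(site), idx))

print(tmpSiteList)
print(holdOut)

# Only if caret is installed: pass the list through the 'index' argument
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(index = tmpSiteList)
}
```

With these assumed labels, each element of "holdOut" contains the rows of
exactly one site.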


I have tried other "method" values in trainControl() (boot, cv), but in the
final RF the "rf.obj$finalModel$inbag" still does not match the "index"
list. My understanding is that "rf.obj$finalModel$inbag" should show which
rows of the sample went into the training of each particular tree, so in
essence it should match the "index" argument that we passed into
trainControl(). Maybe my understanding of what "rf.obj$finalModel$inbag"
shows is wrong?
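
The comparison described above could be sketched as follows. The helper
"inbagVsIndex" is hypothetical, and the toy matrix is made up for
illustration. It assumes the in-bag information is a rows-by-trees matrix
whose entries are greater than zero when a row was used for that tree.

```r
# Hypothetical helper: for each tree (column of 'inbag'), return the position
# of the first 'index' element that contains all of that tree's in-bag rows,
# or NA if the in-bag rows are not contained in any element
inbagVsIndex <- function(inbag, index) {
  apply(inbag, 2, function(tree) {
    rows <- which(tree > 0)
    hit <- which(vapply(index, function(idx) all(rows %in% idx), logical(1)))
    if (length(hit) > 0) hit[1] else NA_integer_
  })
}

index <- list(1:7, c(1:3, 8:10), 4:10)

# Toy in-bag matrix (made up): 10 rows, 3 trees
toy <- cbind(
  c(1, 2, 0, 1, 0, 1, 1, 0, 0, 0),  # rows 1, 2, 4, 6, 7: inside index[[1]]
  c(0, 0, 0, 1, 1, 0, 2, 1, 0, 1),  # rows 4, 5, 7, 8, 10: inside index[[3]]
  c(1, 0, 0, 0, 1, 0, 0, 0, 1, 0)   # rows 1, 5, 9: not inside any element
)
print(inbagVsIndex(toy, index))

# With a real fit (assuming the in-bag matrix was kept when growing the forest):
# inbagVsIndex(rf.obj$finalModel$inbag, tmpSiteList)
```

An NA for a tree would mean that its in-bag rows are spread across all
sites rather than confined to one of the training sets in "index".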

I have not looked at the estimates yet. What I want is just to make sure
that in each tree iteration the "training sites" data do go into the
training, and the "hold out sites" data are used for testing in that
iteration.
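
A check of the split itself, independent of any model, might look like the
sketch below. It again assumes the site labels inferred from the index list
(rows 1-3 = A, 4-7 = B, 8-10 = C) and that the hold-out rows of an iteration
are all rows not listed in its training set. "checkSplit" is a hypothetical
helper name.

```r
# Assumed site labels (inferred from the index list, not given explicitly)
site <- factor(c("A", "A", "A", "B", "B", "B", "B", "C", "C", "C"))
index <- list(1:7, c(1:3, 8:10), 4:10)

# For each iteration, list the training sites and the held-out sites and
# flag whether any site appears on both sides of the split
checkSplit <- function(index, site) {
  do.call(rbind, lapply(seq_along(index), function(i) {
    train <- index[[i]]
    test <- setdiff(seq_along(site), train)
    trainSites <- unique(as.character(site[train]))
    testSites <- unique(as.character(site[test]))
    data.frame(
      iteration = i,
      trainSites = paste(trainSites, collapse = "+"),
      holdOutSites = paste(testSites, collapse = "+"),
      clean = length(intersect(trainSites, testSites)) == 0,
      stringsAsFactors = FALSE
    )
  }))
}

print(checkSplit(index, site))
```

A "clean" value of TRUE in every row means no site contributes to both the
training and the hold-out data of the same iteration.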

Any thoughts/ideas are welcome. Again, I really appreciate your patience
and help on this.

Regards,
Coll

-- 
View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2305269.html


