[R] cross validation in random forest using rfcv function

Elahe chalabi chalabi.elahe at yahoo.de
Wed Aug 23 14:38:42 CEST 2017


Hi all,


I would like to do cross validation in random forest using the rfcv function. The documentation for this package gives the usage as:


rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5, mtry=function(p) max(1, floor(sqrt(p))), recursive=FALSE, ...)


However, I don't know how to build trainx and trainy for my data set, and I could not understand how trainx is built in the package documentation's example for the iris data set.

Here is my data set; I want to use cross validation to estimate the accuracy of classifying the Alzheimer and Control groups:


str(data)
'data.frame':    499 obs. of  606 variables:
$ Gender        : int  0 0 0 0 0 1 1 1 1 1 ...
$ NumOfWords    : num  157 111 163 176 100 124 201 100 76 101
$ NumofLivings  : int  6 6 9 4 3 5 3 3 4 3 ...
$ NumofStopWords: num  77 45 87 91 46 64 104 37 32 41 ...
.
.
$ Group         : Factor w/ 2 levels "Alzheimer","Control","Control"..:


So basically trainy should be data$Group, but what about trainx? Could anyone help me with this?
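To make the question concrete, here is a minimal sketch of one plausible reading: trainx is simply every column except the response, and trainy is the response factor. This is an assumption rather than something confirmed for this data set. Since the real data cannot be posted, the sketch uses a two-class subset of iris as a stand-in, with the response column renamed to Group. The names data, trainx, trainy, result and accuracy are illustrative only.

```r
library(randomForest)  # provides rfcv()

set.seed(1)  # the cross-validation folds are drawn at random

# Stand-in for the real data (hypothetical): a two-class subset of iris
# whose response column is renamed to Group. Replace this with the
# actual 499 x 606 data frame.
data <- droplevels(iris[iris$Species != "setosa", ])
names(data)[names(data) == "Species"] <- "Group"

# Assumption: trainx is all predictor columns, trainy is the response factor.
trainx <- data[, setdiff(names(data), "Group")]
trainy <- data$Group

result <- rfcv(trainx, trainy, cv.fold = 5)

# n.var is the number of predictors used at each step;
# error.cv is the cross-validated error rate for each of those steps.
result$n.var
result$error.cv

# For a classification problem, accuracy is one minus the error rate.
accuracy <- 1 - result$error.cv
accuracy
```

If that reading is right, the same three lines (trainx, trainy, rfcv) should apply to the real data frame unchanged, provided Group is a factor and the predictors contain no missing values.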



Thanks for any help!

Elahe


