[R] Regarding randomForest regression

David Winsemius dwinsemius at comcast.net
Thu Mar 8 17:21:45 CET 2012


On Mar 8, 2012, at 5:10 AM, shameek ghosh wrote:

> Sir,
>     This query is related to randomForest regression using R.
>
>     I have a dataset called qsar.arff which I use as my training set,
> and then I run the following function -
>
>     rf=randomForest(x=train,y=trainy,xtest=train,ytest=trainy,ntree=500)
>
>    where train is a matrix of predictors without the column to be
> predicted (the target column), and trainy is the target column. I feed
> the same data for xtest and ytest too, as shown.
>
>    On verifying, I found that rf$mse[500] and rf$test$mse[500] are
> different (the r-squares are also different). The predicted values of
> the training target column and testing target column are also different.
>
>   Should this happen, since I am using the training dataset as the
> testing dataset? I expected that the test and training predictions
> would be the same.

My inference from its name, _random_Forest, was that it is _not_ a
"deterministic forest".
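
A minimal sketch of how one might check this, assuming the randomForest
package is installed. The built-in mtcars data is only a stand-in for
qsar.arff, and the helper fit() is hypothetical. It refits the forest under
different seeds to show the run-to-run variation, and then compares the two
error measures within a single fit. As far as the package documentation
goes, rf$mse and rf$predicted are computed from out-of-bag samples, while
the rf$test components use every tree, so the two need not agree even when
xtest and ytest are the training data.

```r
library(randomForest)

# Stand-in data (assumption): mtcars replaces the poster's qsar.arff.
train  <- as.matrix(mtcars[, -1])  # predictors, target column removed
trainy <- mtcars$mpg               # target column

# Hypothetical helper: fit with a fixed seed, reusing the training
# data as the test set, as in the original call.
fit <- function(seed) {
  set.seed(seed)
  randomForest(x = train, y = trainy,
               xtest = train, ytest = trainy, ntree = 500)
}

rf1       <- fit(1)
rf2       <- fit(2)
rf1_again <- fit(1)

# Different seeds draw different bootstrap samples and split candidates,
# so the final out-of-bag MSE generally differs between runs.
print(c(seed1 = rf1$mse[500], seed2 = rf2$mse[500]))

# Fixing the seed makes a run reproducible.
print(isTRUE(all.equal(rf1$mse, rf1_again$mse)))

# Within one fit: out-of-bag MSE versus "test" MSE on the same rows.
print(c(oob = rf1$mse[500], test = rf1$test$mse[500]))

# Likewise for the predictions themselves.
print(head(cbind(oob = rf1$predicted, test = rf1$test$predicted)))
```

Calling set.seed() before randomForest() is the usual way to make such
comparisons repeatable.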

-- 

David Winsemius, MD
West Hartford, CT


