[R] errorest slow

David martin vilanew at gmail.com
Thu Jun 20 16:46:23 CEST 2013


Hi,
When I use errorest on a large dataset (12000 variables) it runs 
very slowly. The randomForest package documentation says that for 
large datasets the use of the formula interface is discouraged.

So it's better to use the x and y arguments, as in the example below:
rf <- randomForest(x = df[trainindices, -1], y = df[trainindices, 1],
                   xtest = df[testindices, -1], ytest = df[testindices, 1],
                   do.trace = 5, ntree = 500)

Would it be possible to modify errorest so that it accepts x and y rather 
than a formula? I think that would increase speed on large datasets.

errorest(type ~ ., data = mydate, model = randomForest, mtry = 2)  # runs slowly
errorest(x = type, y = variables, data = mydate,
         model = randomForest, mtry = 2)  # would run faster if implemented
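
In the meantime, one possible workaround is to run the cross-validation by hand and call randomForest through its x/y interface in each fold. The following is only a minimal sketch, assuming a classification problem with a factor response. cv_error_xy is a hypothetical helper of my own naming, not a function from ipred or randomForest, and iris simply stands in for the real data.

```r
library(randomForest)

# Hypothetical helper (not part of ipred or randomForest):
# k-fold cross-validated misclassification error using the x/y interface.
cv_error_xy <- function(x, y, k = 10, ...) {
  n <- nrow(x)
  # Randomly assign each row to one of k folds of near-equal size.
  folds <- sample(rep(seq_len(k), length.out = n))
  pred <- character(n)
  for (i in seq_len(k)) {
    test <- folds == i
    # Fit on the other k - 1 folds without going through a formula.
    fit <- randomForest(x = x[!test, , drop = FALSE], y = y[!test], ...)
    # Predict the held-out fold and store the predicted class labels.
    pred[test] <- as.character(predict(fit, x[test, , drop = FALSE]))
  }
  # Fraction of held-out observations that were misclassified.
  mean(pred != as.character(y))
}

set.seed(1)
# iris is a stand-in dataset; column 5 (Species) is the response.
err <- cv_error_xy(iris[, -5], iris$Species, k = 5, ntree = 100, mtry = 2)
print(err)
```

Extra arguments such as ntree and mtry are passed straight through to randomForest. This sketch does not reproduce the other estimators that errorest offers (bootstrap, .632+), and every training fold must contain each class at least once.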

thanks,
david


