[R] errorest slow

Uwe Ligges ligges at statistik.tu-dortmund.de
Thu Jun 20 19:53:38 CEST 2013



On 20.06.2013 16:46, David martin wrote:
> Hi,
> When using errorest on a large dataset (12000 variables) it runs
> very slowly. The documentation of the randomForest package says that
> for large datasets the use of the formula interface is discouraged.
>
> So it is better to pass x and y directly, as in the example below:
> rf <- randomForest(x=df[trainindices,-1], y=df[trainindices,1],
>                    xtest=df[testindices,-1], ytest=df[testindices,1],
>                    do.trace=5, ntree=500)
>
> Would it be possible to modify errorest so that it uses x and y rather
> than a formula? I think that would increase speed on large datasets.
>
> errorest(type~., data=mydate, model=randomForest, mtry=2)  # slow
> errorest(x=variables, y=type, data=mydate,
>          model=randomForest, mtry=2)  # would be faster, if implemented

Talk to the maintainer of the package you found errorest() in?

Best,
Uwe Ligges
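
Until something like that exists in the package, one possible workaround
is to run the cross-validation by hand with the x/y interface of
randomForest(). The following is only a minimal sketch under stated
assumptions: cv_error_xy() is a hypothetical helper (it is not part of
ipred or randomForest), iris stands in for the real data, and the result
is a plain k-fold misclassification rate, which need not match what
errorest() reports in every detail.

```r
library(randomForest)

# Hypothetical helper (not part of ipred or randomForest): k-fold
# cross-validated misclassification error using the x/y interface,
# so no formula is parsed inside the loop.
cv_error_xy <- function(x, y, k = 10, ...) {
  stopifnot(is.factor(y), nrow(x) == length(y))
  # Randomly assign every observation to one of k folds.
  folds <- sample(rep(seq_len(k), length.out = length(y)))
  wrong <- 0
  for (i in seq_len(k)) {
    test <- folds == i
    # Fit on the training folds; extra arguments go to randomForest().
    fit <- randomForest(x = x[!test, , drop = FALSE], y = y[!test], ...)
    pred <- predict(fit, newdata = x[test, , drop = FALSE])
    # Count misclassified observations in the held-out fold.
    wrong <- wrong + sum(as.character(pred) != as.character(y[test]))
  }
  wrong / length(y)
}

# Example with a small built-in dataset (assumption: a factor response).
set.seed(1)
err <- cv_error_xy(iris[, -5], iris$Species, k = 5, ntree = 100, mtry = 2)
print(err)
```

With very small or unbalanced data a training fold may miss a class
entirely, in which case randomForest() can stop with an error; stratified
folds would avoid that.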

> thanks,
> david
>


