[R] cross-validation in plsr package

Max Kuhn mxkuhn at gmail.com
Mon Feb 22 14:32:23 CET 2010


> The cross-validation in the pls package does not propose a number of
> factors as optimum, you have to select this yourself.  (The reason for
> this is that there is AFAIK no theoretically founded and widely accepted
> way of doing this automatically.  I'd be happy to learn otherwise.)

The caret package has a wrapper for pls and supports multiple resampling
methods (cv, bootstrap, repeated test/train splits, etc.).
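
For what it's worth, here is a rough sketch of tuning ncomp for pls through caret with 10-fold cross-validation. The data are simulated and the object names (x, y, ctrl, fit) are just placeholders of mine; it assumes both the caret and pls packages are installed.

```r
library(caret)  # method = "pls" also requires the pls package

set.seed(1)
n <- 100
p <- 20
# Simulated predictors and response, for illustration only
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- x[, 1] - x[, 2] + rnorm(n)

# 10-fold CV; other resampling schemes include method = "boot",
# "repeatedcv" and "LGOCV" (repeated test/train splits)
ctrl <- trainControl(method = "cv", number = 10)

fit <- train(x, y,
             method = "pls",
             tuneGrid = data.frame(ncomp = 1:8),
             trControl = ctrl)

# Resampled RMSE and its standard deviation for each candidate ncomp
fit$results[, c("ncomp", "RMSE", "RMSESD")]

# The ncomp picked by the default rule (smallest resampled RMSE)
fit$bestTune
```

By default train() keeps the ncomp with the best resampled performance; the results table is there if you would rather pick a value by eye.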

There are a few modules that can be used for automatically determining
the optimal number of components. I agree that there is no uniformly
best technique. The only thing that I know of that is widely accepted
is the one standard error rule from CART. In this case, that would mean
that you find the value of ncomp with the smallest resampled error and
then choose, as the final ncomp, the smallest value whose error is
within one standard error of that minimum. caret can do this or use any
other rule that you think is appropriate.
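
To make the one standard error rule concrete, here is a small base R sketch. The error table is made up for illustration, and one_se_ncomp is a hypothetical helper of mine, not a function from pls or caret. I am assuming se is the standard error of the resampled error (e.g. the standard deviation across folds divided by the square root of the number of folds).

```r
# Hypothetical resampling results: illustrative numbers, not real output
cv <- data.frame(
  ncomp = 1:6,
  rmse  = c(1.90, 1.20, 0.95, 0.91, 0.90, 0.93),
  se    = c(0.10, 0.08, 0.06, 0.06, 0.07, 0.07)
)

# One standard error rule: the smallest ncomp whose error is no more than
# one standard error above the minimum error
one_se_ncomp <- function(ncomp, err, se) {
  best <- which.min(err)               # position of the smallest error
  threshold <- err[best] + se[best]    # minimum error plus one SE
  min(ncomp[err <= threshold])         # simplest model under the threshold
}

chosen <- one_se_ncomp(cv$ncomp, cv$rmse, cv$se)
chosen
```

Here the minimum is at ncomp = 5 (0.90), the threshold is 0.97, and the rule picks ncomp = 3. In caret the same idea is available through trainControl(selectionFunction = "oneSE"), and selectionFunction can also be given a function of your own if you prefer a different rule.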

Thanks,

Max


