[R] getting p-value and standard error in PLS

Bjørn-Helge Mevik b.h.mevik at usit.uio.no
Wed Oct 19 09:45:18 CEST 2011


arunkumar1111 <akpbond007 at gmail.com> writes:

> How to get p-value and the standard error in PLS

There is (to my knowledge) no theory for calculating p-values for the
regression coefficients in PLS regression.  Most practitioners use
cross-validation to estimate the Root Mean Squared Error of Prediction
(RMSEP) and use that as a measure of the quality of the fit.  PLS
regression is typically used when you have many (hundreds, thousands,
tens of thousands) predictors, where individual p-values are not very
useful.
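
As a rough illustration of the cross-validation approach, here is a
minimal sketch.  It assumes the yarn data set shipped with the pls
package (not your data), and the choice of 10 components and
leave-one-out cross-validation is arbitrary:

```r
library(pls)

# Example data from the pls package: NIR spectra (many predictors)
# and the density of yarn samples.
data(yarn)

# Fit a PLS regression with leave-one-out cross-validation.
fit_cv <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "LOO")

# Cross-validated RMSEP for 0 (intercept only) up to 10 components.
rmsep_cv <- RMSEP(fit_cv)
print(rmsep_cv)
```

plot(rmsep_cv) shows the same estimates as a curve, which is usually
the easiest way to pick the number of components.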

The pls package does implement the jackknife to estimate the
variance/standard error of the regression coefficients.  There is even a
function to calculate p-values from that, but please _do_ read the
warning in the documentation: the distribution of the "t values" used in
the test is _unknown_.  See the example in ?jack.test for how to use the
jackknife.
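
A minimal sketch of the jackknife usage, again assuming the yarn data
from the pls package and an arbitrary choice of 3 components.  The
model has to be fitted with cross-validation and jackknife = TRUE, and
the resulting p-values should be read with the above warning in mind:

```r
library(pls)

data(yarn)

# The jackknife reuses the cross-validation segments, so the model
# must be fitted with validation and jackknife = TRUE.
fit_jack <- plsr(density ~ NIR, ncomp = 3, data = yarn,
                 validation = "LOO", jackknife = TRUE)

# Jackknife variance estimates of the regression coefficients.
v <- var.jack(fit_jack, ncomp = 3)

# Approximate t tests; the true distribution of the "t values" is unknown.
jt <- jack.test(fit_jack, ncomp = 3)
head(jt$sd)       # jackknife standard errors
head(jt$pvalues)  # approximate p-values
```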

> I have used the following function to calculate PLS
>
> fit1 <- mvr(formula=Y~X1+X2+X3+X4, data=Dataset, comp=4)

From a previous message on this list, I see that each of these predictor
terms (X1, ...) is a vector.  Thus you have only 4 predictor variables,
so it would probably be better to use Ordinary Least Squares (OLS)
regression (the lm() function in R).  There you get p-values automatically.

Furthermore, a PLS regression with the same number of components as
predictor variables is equivalent to OLS, so there seems to be no reason
to use PLS at all in your case.
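
To illustrate both points, here is a small sketch on simulated data.
The data frame below is made up for the example and only mimics the
structure of your model (four vector predictors); it also uses ncomp,
which is the documented name of the argument for the number of
components in the pls package:

```r
library(pls)

# Simulated stand-in for the poster's data: four ordinary predictors.
set.seed(1)
n <- 50
Dataset <- data.frame(X1 = rnorm(n), X2 = rnorm(n),
                      X3 = rnorm(n), X4 = rnorm(n))
Dataset$Y <- with(Dataset, 1 + 2 * X1 - X2 + 0.5 * X3 + rnorm(n))

# OLS: estimates, standard errors, t values and p-values in one table.
ols <- lm(Y ~ X1 + X2 + X3 + X4, data = Dataset)
print(summary(ols)$coefficients)

# PLS with as many components as predictors.
pls_fit <- plsr(Y ~ X1 + X2 + X3 + X4, ncomp = 4, data = Dataset)

# Compare the slope coefficients (the OLS intercept is dropped).
b_pls <- drop(coef(pls_fit, ncomp = 4))
b_ols <- coef(ols)[-1]
print(all.equal(unname(b_pls), unname(b_ols), tolerance = 1e-6))
```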

-- 
Cheers,
Bjørn-Helge Mevik


