[R] pls regression - optimal number of LVs

Dowkiw, Arnaud Arnaud.Dowkiw at dpi.qld.gov.au
Thu Jul 24 05:02:02 CEST 2003


Dear R-helpers,

I have performed a PLS regression with the mvr function from the pls.pcr package and I have 2 questions:
1- Do you know whether mvr automatically centers the data? It seems to me that it does.
2- Why, in the situation below, does the output say that the optimal number of latent variables is 4? In my humble opinion it is 2, because the cross-validated RMS increases (from 12.15 to 13.81) and the R2 decreases (from 0.51 to 0.41) when a third LV is added:
> summary(maturityCondor.raw.mvr)
Data:   X dimension: 8 1050 
        Y dimension: 8 1
Method: SIMPLS
Number of latent variables considered: 1-7 


TRAINING:
RMS table:
           [,1]
1 LV's 1.23e+01
2 LV's 6.79e+00
3 LV's 5.00e+00
4 LV's 2.17e+00
5 LV's 1.93e+00
6 LV's 7.79e-01
7 LV's 1.01e-09

Cumulative fraction of variance explained:
           X     Y
1 LV's 0.848 0.499
2 LV's 0.930 0.846
3 LV's 0.979 0.917
4 LV's 0.992 0.984
5 LV's 0.999 0.988
6 LV's 1.000 0.998
7 LV's 1.000 1.000


VALIDATION
Optimal number of latent variables: 4

RMS table (10-fold crossvalidation):
        [,1]
1 LV's 16.21
2 LV's 12.15
3 LV's 13.81
4 LV's  6.68
5 LV's  6.38
6 LV's  5.91
7 LV's 13.38

Coefficient of multiple determination (R2):
       [,1]
1 LV's 0.20
2 LV's 0.51
3 LV's 0.41
4 LV's 0.88
5 LV's 0.87
6 LV's 0.90
7 LV's 0.77
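
To make the disagreement concrete, here is a minimal sketch in base R that applies three common selection rules to the cross-validated RMS values in the table above. It does not use mvr or claim to reproduce its internal rule; the variable names and the tolerance `tol` are assumptions chosen only for illustration (the summary does not print standard errors, so a "within one standard error" rule cannot be reproduced from this output alone).

```r
# Cross-validated RMS values copied from the validation table (1 to 7 LVs)
cv_rms <- c(16.21, 12.15, 13.81, 6.68, 6.38, 5.91, 13.38)

# Rule A: global minimum of the cross-validated RMS
global_min <- which.min(cv_rms)

# Rule B: first local minimum, i.e. stop as soon as adding an LV
# makes the cross-validated RMS worse
first_increase <- which(diff(cv_rms) > 0)[1]
first_local_min <- if (is.na(first_increase)) length(cv_rms) else first_increase

# Rule C: smallest model whose RMS is within a tolerance of the global minimum
# (tol is a hypothetical value, NOT taken from the mvr output)
tol <- 1.0
parsimonious <- which(cv_rms <= min(cv_rms) + tol)[1]

print(c(global_min = global_min,
        first_local_min = first_local_min,
        parsimonious = parsimonious))
```

With these numbers, Rule B gives 2 LVs (the reading argued for in question 2), Rule A gives 6, and Rule C gives 4 under the assumed tolerance. A reported optimum of 4 would therefore be consistent with some parsimony rule around the global minimum, but which rule mvr actually applies needs to be confirmed from the package documentation.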

Thanks for your help,

Arnaud


*************************
Arnaud DOWKIW
Department of Primary Industries
J. Bjelke-Petersen Research Station
KINGAROY, QLD 4610
Australia
T : + 61 7 41 600 700
T : + 61 7 41 600 728 (direct)
F : + 61 7 41 600 760
**************************
 




