[R] Stepwise Regression and PLS

Jinsong Zhao jinsong_zh at yahoo.com
Sun Feb 1 20:09:28 CET 2004

Dear all,

I am a newcomer to R. I intend to use R to do
stepwise regression and PLS on a data set (a 55x20
matrix, with one dependent and 19 independent
variables). I have already done the same analyses on
the same data set using SPSS and SAS. However, the
results obtained with R differ considerably from those
of SPSS and SAS.

In the case of stepwise regression, SPSS gave a model
with 4 independent variables, but R's step() gave a
model with 10 and a much higher R2. Furthermore,
regsubsets() also indicates that the 10-variable model
is one of the best regression subsets. How can this
difference be explained? And for my data set, how many
variables would it be reasonable to include in the
model?
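
One likely source of the gap is the selection criterion: step() ranks
models by AIC (penalty k = 2 per parameter) by default, while SPSS's
stepwise method enters and removes variables by F-test p-values, which
is usually stricter. The following is only a sketch on simulated data,
since the original data set is not shown; the names dat, full, null,
fit_aic and fit_bic are made up for illustration. It compares the
default penalty with the heavier k = log(n) (BIC) penalty, which is
not the same as the SPSS criterion but often selects fewer variables.

```r
# Simulated stand-in for a 55 x 20 data set (NOT the poster's data):
# one response y and 19 predictors, of which only x1..x3 matter.
set.seed(1)
n <- 55
p <- 19
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("x", 1:p)))
dat <- data.frame(y = 2 * X[, 1] - 1.5 * X[, 2] + X[, 3] + rnorm(n), X)

full <- lm(y ~ ., data = dat)   # all 19 predictors
null <- lm(y ~ 1, data = dat)   # intercept only
scope <- list(lower = ~ 1, upper = formula(full))

# Default step(): AIC, i.e. penalty k = 2 per parameter
fit_aic <- step(null, scope = scope, direction = "both", trace = 0)

# Heavier penalty k = log(n), i.e. BIC
fit_bic <- step(null, scope = scope, direction = "both", trace = 0,
                k = log(n))

# Number of selected predictors and R2 under each criterion
length(coef(fit_aic)) - 1
length(coef(fit_bic)) - 1
summary(fit_aic)$r.squared
summary(fit_bic)$r.squared
```

R2 never decreases as variables are added, so a higher R2 for the
larger model does not by itself show that it is the better one.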

In the case of PLS, the results of the mvr function in
the pls.pcr package also differ from those of SAS.
Although the optimum number of latent variables is the
same, the difference in R2 is large. Why?
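
Two things that commonly make PLS R2 values disagree between programs
are whether the predictors are standardized before fitting and whether
the reported R2 is the training (fitted) value or a cross-validated
one. The sketch below shows how both can be checked. It rests on
assumptions: it uses plsr() and R2() from the current pls package
rather than mvr from pls.pcr, it runs on simulated data, and the
choice of 5 components is arbitrary. What SAS reports by default is
not verified here.

```r
library(pls)  # assumption: the current 'pls' package, not pls.pcr

# Simulated stand-in data (NOT the poster's data); the columns are
# given different scales so that standardizing makes a difference.
set.seed(1)
n <- 55
p <- 19
X <- matrix(rnorm(n * p), n, p) %*% diag(seq(1, 10, length.out = p))
y <- drop(X[, 1:3] %*% c(2, -1.5, 1)) + rnorm(n)
dat <- data.frame(y = y, X = I(X))  # keep X as one matrix column

# Same model fitted without and with standardized predictors,
# each with leave-one-out cross-validation
fit_raw <- plsr(y ~ X, ncomp = 5, data = dat, scale = FALSE,
                validation = "LOO")
fit_std <- plsr(y ~ X, ncomp = 5, data = dat, scale = TRUE,
                validation = "LOO")

# R2 for 0..5 components: training versus cross-validated
r2_train <- drop(R2(fit_std, estimate = "train")$val)
r2_cv    <- drop(R2(fit_std, estimate = "CV")$val)
r2_raw   <- drop(R2(fit_raw, estimate = "train")$val)

round(rbind(train = r2_train, CV = r2_cv, train_unscaled = r2_raw), 3)
```

If the two programs agree once scaling and the type of R2 are matched,
the difference was in the settings rather than in the algorithm.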

Any comments and suggestions would be much appreciated.
Thanks in advance!

Best wishes,

Jinsong Zhao

(Mr.) Jinsong Zhao
Ph.D. Candidate
School of the Environment
Nanjing University
22 Hankou Road, Nanjing 210093
P.R. China
E-mail: jinsong_zh at yahoo.com
