[R] How to write a loop in R to select multiple regression model and validate it ?

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Wed Jun 5 02:12:00 CEST 2013

This doesn't look like a task you have acquired through a real-life problem... it looks like homework. There is a stated no-homework policy in the Posting Guide (please read it), since you should be using the resources provided along with your educational environment (teaching assistants, tutors, office hours...), and we don't know whether the help we provide would be considered "cheating".
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
Sent from my phone. Please excuse my brevity.

beginner <paxkn at nottingham.ac.uk> wrote:

>I would like to run a loop in R. I have never done this before, so I
>would be very grateful for your help!
>1. I have a sample set of 25 objects. I would like to draw 1 object from
>it and use it as a test set for my future external validation. The
>remaining 24 objects I would like to use as a training set (to select a
>model). I would like to repeat this process until all 25 objects have
>been used as a test set.
>2. For each of the training sets I would like to run the following code:
>forward <- regsubsets(Y ~.,data = training, method = "forward",
>backward <- regsubsets(Y ~.,data = training, method = "backward",
>stepwise <- regsubsets(Y ~., data = training, method = "seqrep",
>exhaustive <- regsubsets(Y ~.,data = training, method = "exhaustive",
>I would like R to select the best model (the one with the highest
>R2) using each of the selection methods, so that there are 4 final best
>models (e.g. the best model selected with forward selection, the best
>model selected with backward selection, and so on).
>Afterwards I would like to perform internal cross-validation of all 4
>selected models and choose the 1 out of 4 which has the lowest average
>mean squared error (MSE). I used to do it using the code below:
>val.daag<-CVlm(df=training, m=1, form.lm=formula(Y ~ X1+X2+X3))
>val.daag<-CVlm(df=training, m=1, form.lm=formula(Y ~ X1+X2+X4))
>val.daag<-CVlm(df=training, m=1, form.lm=formula(Y ~ X3+X4+X5))
>val.daag<-CVlm(df=training, m=1, form.lm=formula(Y ~ X4+X5+X7))
>For the best selected model (the one with the lowest MSE) I would like to
>perform an external validation on the 1 object set aside at the beginning
>of the study (please refer to point 1).
>3. And loop again using different training and test set ....
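
The outer loop described in points 1 and 3 (hold out one object, train on the rest, repeat for every object) might be sketched in base R as follows. This is only an illustration under stated assumptions: the built-in `mtcars` data and the fixed formula `mpg ~ wt + hp` are stand-ins for the poster's data and for whatever model the selection step would return.

```r
# Leave-one-out "external" validation, base R only.
# Assumption: mtcars and mpg ~ wt + hp are stand-ins for the real data/model.
dat <- mtcars
n <- nrow(dat)
ext_err <- numeric(n)                    # one squared prediction error per held-out row

for (i in seq_len(n)) {
  training <- dat[-i, ]                  # every row except row i
  test     <- dat[i, , drop = FALSE]     # the single held-out row

  # Model selection on `training` would go here; a fixed model keeps the sketch short.
  fit <- lm(mpg ~ wt + hp, data = training)

  pred <- as.numeric(predict(fit, newdata = test))
  ext_err[i] <- (test$mpg - pred)^2
}

ext_mse <- mean(ext_err)                 # average external squared error over all splits
print(ext_mse)
```

Because each row is held out exactly once, `ext_err` ends up with one entry per object, and everything done inside the loop sees only the training rows.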
>I hope that you could help me with this. 
>If you have any suggestions on how to select the best model and perform
>validation more efficiently, I would be happy to hear about them.
>Thank you !
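
The step in point 2 of comparing the four selected models by cross-validated MSE could look roughly like the sketch below. It swaps in a hand-written leave-one-out loop in base R rather than `CVlm`, and the data (`mtcars`), the four formulas, and the helper `loo_mse` are all hypothetical placeholders for the poster's `training` set and the models chosen by each selection method.

```r
# Compare candidate models by leave-one-out cross-validated MSE, base R only.
# Assumption: mtcars and these four formulas are hypothetical stand-ins.
training <- mtcars

candidates <- list(
  forward    = mpg ~ wt + hp,
  backward   = mpg ~ wt + qsec,
  stepwise   = mpg ~ wt + hp + qsec,
  exhaustive = mpg ~ wt + cyl
)

# Leave-one-out MSE for one formula: refit without row i, then predict row i.
loo_mse <- function(form, data) {
  response <- all.vars(form)[1]          # name of the response variable
  sq_err <- vapply(seq_len(nrow(data)), function(i) {
    fit  <- lm(form, data = data[-i, ])
    pred <- as.numeric(predict(fit, newdata = data[i, , drop = FALSE]))
    (data[[response]][i] - pred)^2
  }, numeric(1))
  mean(sq_err)
}

cv_mse <- vapply(candidates, loo_mse, numeric(1), data = training)
best   <- names(which.min(cv_mse))       # candidate with the lowest CV MSE

print(cv_mse)
print(best)
```

One caution on the selection criterion: within a nested sequence of least-squares models, R2 on the training data cannot decrease as predictors are added, so "highest R2" tends to favour the largest model. A cross-validated error such as the one computed here is one common alternative.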
