[R] automating sequence of multinomial regressions

Gabor Grothendieck ggrothendieck at myway.com
Wed Jul 28 07:06:30 CEST 2004


Daniel <spamiam <at> aroint.org> writes:

> 
> Disclaimer first: I only heard about R fairly recently, so I apologize if
> this is either a simple or impossible request, but R looked like it
> might be a good framework for this sort of thing...
> 
> Is it possible to write a script to run stepwise multinomial regressions
> on many *dependent* variables, and then compare results to a validation
> data set (e.g., Chow test)? Essentially, automate the process of finding
> best predictive model using a host of dependent and independent variables.
> 
> I have a fairly short timeframe to work on this, so if someone is
> willing to help me in the next couple of days, I would be most
> appreciative. (And there might even be a hefty sum of cash involved!)

Setting aside the basic overfitting problems, the following runs a
stepwise multinomial regression for each of 10 dependent variables,
using the first 100 rows of birthwt as the training set.  For each
of the 10 resulting models it then counts the number of correct
predictions on the remaining 89 rows.


require(nnet)
require(MASS)

# use birthwt data set and generate random matrix whose 10 cols are dep vars
data(birthwt)
set.seed(1)
dep <- matrix(sample(2, 189*10, replace = TRUE), nrow = 189) - 1

# run one stepwise procedure for each dep variable using rows 1 to 100
# and store result in z so that z[[i]] has output from ith dep variable
z <- apply(dep[1:100,], 2, function(d) 
         step(multinom(formula = d ~., data = birthwt[1:100,-1])))

# count correct predictions for each model using rows 101 to 189, comparing
# each model's predictions with its own dependent variable (column i of dep)
sapply(seq_along(z), function(i)
         sum(predict(z[[i]], birthwt[101:189,-1]) == dep[101:189,i]))
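
The raw hit counts are easier to judge as proportions set against what
chance alone would give.  The following is only a rough sketch of that
idea, not a tested recipe.  It assumes the same random 0/1 responses as
above and that each model is scored against its own column of dep.  The
names train_rows, test_rows, predictors, holdout_accuracy and
chance_range are made up for this illustration and are not part of nnet
or MASS.

```r
library(nnet)   # multinom()
library(MASS)   # birthwt data set

data(birthwt)
set.seed(1)

# Assumption: random 0/1 responses, one column per dependent variable
dep <- matrix(sample(0:1, 189 * 10, replace = TRUE), nrow = 189)

train_rows <- 1:100
test_rows  <- 101:189
predictors <- birthwt[, -1]   # drop the first column (low)

# Hypothetical helper: fit a stepwise model for column i of dep on the
# training rows and return the share of correct hold-out predictions
holdout_accuracy <- function(i) {
  d <- dep[train_rows, i]
  fit <- step(multinom(d ~ ., data = predictors[train_rows, ], trace = FALSE),
              trace = 0)
  pred <- predict(fit, newdata = predictors[test_rows, ])
  mean(as.character(pred) == as.character(dep[test_rows, i]))
}

acc <- sapply(seq_len(ncol(dep)), holdout_accuracy)

# Central 95% range of the hit rate expected from pure guessing
# (binomial with p = 0.5) on the hold-out rows
n_test <- length(test_rows)
chance_range <- qbinom(c(0.025, 0.975), size = n_test, prob = 0.5) / n_test

round(acc, 2)
chance_range
```

Since the responses here are coin flips, hold-out proportions that stay
inside chance_range are what one would expect, however good the
stepwise fits look on the training rows.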



