[R] Error when using step

Chris Friedl cfriedalek at gmail.com
Mon Jun 22 08:53:05 CEST 2009


I have two questions about the built-in function step. Ultimately I want to
apply a lm fitting and subsequent step procedure to thousands of data sets
groups by a factor defined as a unique ID.

Q1. The code below creates a data.frame comprising three marginally noisy
surfaces. The code below fails when I use step in a function but summary
seems to show the model fits are legitimate. Any ideas on what I'm doing
wrong?

# y =      x1 +  x2          for grp A
# y = 2 + 2x1 + 4x2          for grp B
# y = 3 + 2x1 + 4x2 + 6x1x2  for grp C
ind <- matrix(runif(200), ncol=2, dimnames=list(c(), c("x1","x2")))
d1 <- data.frame(ind, y=ind[,"x1"]+ind[,"x2"]+rnorm(100,0,0.05),
grp=rep("A",100))
d2 <- data.frame(ind,
y=2+2*ind[,"x1"]+4*ind[,"x2"]+rnorm(100,0,0.05),grp=rep("B", 100))
d3 <- data.frame(ind,
y=3+2*ind[,"x1"]+4*ind[,"x2"]+6*ind[,"x1"]*ind[,"x2"]+rnorm(100,0,0.05),grp=rep("C",100))
data2 <- rbind(d1,d2,d3)
# Fit each surface by grp
model <- y ~ x1*x2
fits <- lapply(split(data2, list(data2$grp), drop=TRUE),
function(x){lm(model, data = x)})
lapply(fits, function(x){summary(x)})

# FAIL
lapply(fits, function(x){step(x)})

# Output

Start:  AIC=-601.41
y ~ x1 * x2

        Df Sum of Sq     RSS     AIC
- x1:x2  1   0.00159    0.23 -602.71
<none>                  0.23 -601.41
Error in as.data.frame.default(data) : 
  cannot coerce class "lm" into a data.frame



I get a different error if I use step directly on the first fit in the list.

step(fits[[1]])

# Output

Start:  AIC=-601.41
y ~ x1 * x2

        Df Sum of Sq     RSS     AIC
- x1:x2  1   0.00159    0.23 -602.71
<none>                  0.23 -601.41
Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one


However step works if I do the fit outside of a function.

fit <- lm(model, data=data2[which(data2$grp=="A"),])
step(fit)

# Output

Start:  AIC=-601.41
y ~ x1 * x2

        Df Sum of Sq     RSS     AIC
- x1:x2  1   0.00159    0.23 -602.71
<none>                  0.23 -601.41

Step:  AIC=-602.71
y ~ x1 + x2

       Df Sum of Sq     RSS     AIC
<none>                 0.23 -602.71
- x2    1      8.36    8.58 -241.54
- x1    1      9.55    9.78 -228.47

Call:
lm(formula = y ~ x1 + x2, data = data2[which(data2$grp == "A"),     ])

Coefficients:
(Intercept)           x1           x2  
   0.001457     0.999313     1.010880  


Q2. How can I setup step to exclude the intercept. In this example above
both the intercept and interaction terms are insignificant. I have tried
setting direction to "both" and "forward" but that didn't help.

Thanks for your help.


-- 
View this message in context: http://www.nabble.com/Error-when-using-step-tp24142487p24142487.html
Sent from the R help mailing list archive at Nabble.com.




More information about the R-help mailing list