[R] Problem with ldply

cfriedl cfriedalek at gmail.com
Mon May 17 07:20:45 CEST 2010

I've been examining a number of linear regression models on a large dataset,
following the basic ideas presented in "Calculating all possible linear
regressions". I run into a problem with ldply when a formula includes no
intercept. Here's a simple test to show what happens.

library(plyr)  # provides ldply

# data and two linear model regressions
# (x is defined first so that y can be computed from it)
x <- 0:10
xy <- data.frame(x = x, y = 2*x + 0.2*rnorm(11))
models <- as.list(c('y ~ x', 'y ~ -1 + x'))
models <- lapply(models, function(m) as.formula(m))
fits <- lapply(models, function(m) lm(m, data=xy))

# regression summaries specified individually (OK)
coef(summary(fits[[1]]))

#               Estimate Std. Error     t value     Pr(>|t|)
# (Intercept) -0.0594176 0.10507394  -0.5654837 5.855640e-01
# x            2.0163534 0.01776074 113.5286997 1.620614e-15

coef(summary(fits[[2]]))

#   Estimate Std. Error  t value     Pr(>|t|)
# x 2.007865 0.00916494 219.0811 9.652427e-20

# Coefficients as a dataframe using ldply (OK)
ldply(fits, function(x) as.data.frame(t(coef(x))))

#   (Intercept)        x
# 1  -0.0594176 2.016353
# 2          NA 2.007865

# Std Errors as a dataframe using ldply  (FAIL)
# The variable name 'x' is lost for the second model, which has no intercept.
# The default variable name V1 appears in the output instead.
# The same behaviour is observed for 't value' and 'Pr(>|t|)'
ldply(fits, function(x) as.data.frame(t(coef(summary(x))[,'Std. Error'])))

#   (Intercept)          x         V1
# 1   0.1050739 0.01776074         NA
# 2          NA         NA 0.00916494

Is this a bug or (hopefully) user error? Any ideas for a workaround?
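
One possible explanation, offered as a sketch and not as a confirmed
diagnosis of plyr: with a single coefficient, coef(summary(x)) is a one-row
matrix, and indexing it with [, 'Std. Error'] drops both dimensions, so the
result is a bare number with no name. The example below assumes that
subsetting with drop = FALSE is an acceptable workaround. std_errors() is a
hypothetical helper name introduced only for illustration, and set.seed(1)
is added only to make the run repeatable.

```r
# Minimal sketch; std_errors() is a hypothetical helper, not part of plyr.
set.seed(1)
x <- 0:10
xy <- data.frame(x = x, y = 2 * x + 0.2 * rnorm(11))
fits <- lapply(c("y ~ x", "y ~ -1 + x"),
               function(f) lm(as.formula(f), data = xy))

# With one coefficient the summary matrix has a single row, so plain
# [, "Std. Error"] collapses to an unnamed number (the source of V1).
no_int <- coef(summary(fits[[2]]))
names(no_int[, "Std. Error"])

# drop = FALSE keeps the result as a one-column matrix, so the coefficient
# names survive t() and become the data frame's column names.
std_errors <- function(fit) {
  se <- coef(summary(fit))[, "Std. Error", drop = FALSE]
  as.data.frame(t(se))
}
names(std_errors(fits[[1]]))
names(std_errors(fits[[2]]))

# Combine with ldply only if plyr is installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::ldply(fits, std_errors))
}
```

Swapping "Std. Error" for "t value" or "Pr(>|t|)" inside the helper should
handle the other two columns in the same way.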


View this message in context: http://r.789695.n4.nabble.com/Problem-with-ldply-tp2219094p2219094.html
Sent from the R help mailing list archive at Nabble.com.
