[R] lm() using data frames

Simon Fear fears at roycastle.liv.ac.uk
Tue Mar 9 11:59:07 CET 1999


Under NT 4.0, using Version 0.63.2 Beta (Jan 12, 1999):

Not sure if this is a bug or a feature (forcing me to program less
clumsily) so I'll report it here rather than to bugs.

With a medium-sized data set (1700 observations, 70 explanatory variables)
and plenty of memory, specifically

> gc()
          free   total
Ncells  886738 1000000
Vcells 7912909 8388608

I get a fatal error when attempting summary() on the fit of an lm() on a
large-ish set of dummy variables (stored in a matrix):

Call:
lm(formula = sasch2[, "ddiff"] ~ sasch2[, "td30"] + sasch2[, "td60"] +
    sasch2[, "td90"] + sasch2[, "td120"] + +sasch2[, "td180"] +
    sasch2[, "td240"] + sasch2[, "td300"] + sasch2[, "td360"] +
    sasch2[, "td420"] + sasch2[, "td480"] + sasch2[, "db1"] +
    sasch2[, "db1.5"] + sasch2[, "db2"] + sasch2[, "db2.5"] +
    +sasch2[, "db3.5"] + sasch2[, "db4"] + sasch2[, "db4.5"] +
    sasch2[, "db5"] + sasch2[, "db5.5"] + +sasch2[, "db6"] +
    sasch2[, "db6.5"] + sasch2[, "db7"] + sasch2[, "db7.5"] +
    sasch2[, "db8"] + +sasch2[, "db8.5"] + sasch2[, "db9"] +
    sasch2[, "db9.5"])

I get estimates OK, but summary() collapses. However, if I do the same
thing less clumsily, by writing all the relevant variables to a new data
frame, and then calling

Call:
lm(formula = ddiff ~ ., data = dtmp)

I not only get the estimates but can also run summary() with no problem.
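
For anyone wanting to try the data frame route, here is a minimal sketch of
what I mean. The real sasch2 is not shown here, so the matrix below is
simulated stand-in data with only a handful of the dummy columns; the sizes
and values are assumptions for illustration only.

```r
# Simulated stand-in for the sasch2 matrix (hypothetical data).
set.seed(1)
n <- 200
vars <- c("td30", "td60", "td90", "db1", "db1.5", "db2")
sasch2 <- cbind(ddiff = rnorm(n),
                matrix(rbinom(n * length(vars), 1, 0.3), nrow = n,
                       dimnames = list(NULL, vars)))

# Copy the response and the chosen dummy columns into a new data frame.
dtmp <- as.data.frame(sasch2[, c("ddiff", vars)])

# "." stands for every column of dtmp other than the response.
fit <- lm(ddiff ~ ., data = dtmp)
summary(fit)
```

The formula stays short however many dummy columns are copied into dtmp,
since "." picks them all up.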

Any ideas why? It seems to be memory-linked, because I can lm() and
summary() the matrix versions when using only the sasch2[, "td*"] columns
or only the sasch2[, "db*"] columns.
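
A rough sketch of the subset fits, again on simulated stand-in data rather
than the real sasch2. Here the columns are picked by name pattern with
grep(), which is just one convenient way to do it, and the whole submatrix
goes on the right-hand side of the formula. The names td.cols, db.cols,
fit.td and fit.db are made up for this example.

```r
# Simulated stand-in for the sasch2 matrix (hypothetical data).
set.seed(2)
n <- 200
vars <- c("td30", "td60", "db1", "db1.5")
sasch2 <- cbind(ddiff = rnorm(n),
                matrix(rbinom(n * length(vars), 1, 0.3), nrow = n,
                       dimnames = list(NULL, vars)))

# Pick out the two groups of dummy columns by their name prefix.
td.cols <- grep("^td", colnames(sasch2), value = TRUE)
db.cols <- grep("^db", colnames(sasch2), value = TRUE)

# Matrix versions: one fit per group, straight from the matrix.
fit.td <- lm(sasch2[, "ddiff"] ~ sasch2[, td.cols])
fit.db <- lm(sasch2[, "ddiff"] ~ sasch2[, db.cols])
summary(fit.td)
summary(fit.db)
```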

Simon Fear
