[R] Workings of model.frame.default and [.
Frank E Harrell Jr
fharrell at virginia.edu
Thu Jul 25 04:27:12 CEST 2002
I sent this note last Friday just before the weekend and didn't get any replies. I'm sending it again in the hope that someone will offer some insight. -Frank
This is related to my earlier question, to which I received very helpful replies. When I provide a subsetting method that automatically drops unused levels of a factor variable, I run into trouble with model.frame.default. I know that model.frame.default has its own mechanism for dropping unused levels, but my preference is to handle this at a more basic level using [.factor and not to specify drop.unused.levels=TRUE to model.frame.default. That way, subsetting operations that are not carried out by model.frame also work the way I want, especially [.data.frame when I attach or otherwise reference a subset of a data frame.
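For readers who want something concrete, here is a minimal sketch of the kind of local [.factor I mean. It is not my actual code; it assumes a method modelled on the base one with only the default of drop changed, and it uses factor() to rebuild the levels. Whether code inside packages (including [.data.frame) picks up a workspace-level override like this depends on the R version and its method lookup rules, so treat it as an illustration of calls made from the workspace only.

```r
# Hypothetical local override: same shape as the base method,
# but unused levels are dropped unless drop = FALSE is given.
`[.factor` <- function(x, ..., drop = TRUE) {
  y <- NextMethod("[")
  attr(y, "contrasts") <- attr(x, "contrasts")
  attr(y, "levels") <- attr(x, "levels")
  class(y) <- oldClass(x)
  if (drop) factor(y) else y
}

f <- factor(c("a", "b", "c", "a"))

sub  <- f[f != "b"]                 # level "b" is dropped
kept <- f[f != "b", drop = FALSE]   # all original levels retained

levels(sub)
levels(kept)
```

Defining the method in the workspace keeps the experiment easy to undo: rm("[.factor") restores the standard behaviour.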
Inside model.frame.default, a 'variables' list is constructed; for factor variables it carries all of the original levels. Then .Internal(model.frame()) is invoked, and this in turn invokes my local [.factor, which drops unused levels. model.frame is then thrown off by the disparity between the levels in 'variables' and the levels returned by [.data.frame (which calls [.factor): it returns an invalid factor variable in which the levels are shifted and some real levels at the end have zero frequencies. [I am leaving drop.unused.levels=FALSE when running model.frame.]
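As a point of comparison, the following sketch shows what the stock model.frame does with a subset when no local [.factor is in play, with and without drop.unused.levels. The data frame d and the helper codes_ok are made up for illustration; codes_ok is just one simple way to check that a factor's integer codes still line up with its levels, which is the property that breaks in the situation described above.

```r
# Toy data: factor g has three levels, and the subset removes all of "b".
d <- data.frame(y = c(1, 2, 3, 4, 5, 6),
                g = factor(c("a", "a", "b", "b", "c", "c")))

# Default: unused levels are kept in the model frame.
mf_keep <- model.frame(y ~ g, data = d, subset = g != "b")

# model.frame's own mechanism for dropping them.
mf_drop <- model.frame(y ~ g, data = d, subset = g != "b",
                       drop.unused.levels = TRUE)

# Hypothetical sanity check: every non-missing code must index a level.
codes_ok <- function(f) {
  codes <- as.integer(f)
  all(is.na(codes) | codes %in% seq_along(levels(f)))
}

levels(mf_keep$g)
table(mf_keep$g)
levels(mf_drop$g)
codes_ok(mf_keep$g)
```

A zero count for "b" in mf_keep is expected here; the problem I am reporting is different, in that real levels end up with zero counts because the codes and the level labels no longer match.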
Is model.frame doing this by intentional design? If not, can it be fixed? It seems to me that, to be general, .Internal(model.frame()) should not depend on levels remaining unchanged when [.data.frame is executed. If model.frame really needs to operate this way, does anyone see a workaround?
Thanks again, and I'll put in one more plug for [.factor to be modified so that if a system option 'drop.unused.levels' is TRUE (i.e., NOT by default) drop=TRUE is assumed unless drop=FALSE is explicitly stated by the user. Then I can dispose of my local [.factor once and for all.
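To make the proposal concrete, here is a rough sketch of how such an option could drive the default. The option name 'drop.unused.levels' is only my suggestion and is not an existing R option; the method below is a workspace-level illustration, not a patch to base R.

```r
# Hypothetical: the default for drop comes from a user option.
# An unset option behaves like FALSE, so nothing changes by default.
`[.factor` <- function(x, ...,
                       drop = isTRUE(getOption("drop.unused.levels"))) {
  y <- NextMethod("[")
  attr(y, "contrasts") <- attr(x, "contrasts")
  attr(y, "levels") <- attr(x, "levels")
  class(y) <- oldClass(x)
  if (drop) factor(y) else y
}

f <- factor(c("a", "b", "c", "a"))

options(drop.unused.levels = NULL)       # option unset: standard behaviour
lev_default <- levels(f[f != "b"])

options(drop.unused.levels = TRUE)       # opt in to dropping
lev_option  <- levels(f[f != "b"])
lev_explicit <- levels(f[f != "b", drop = FALSE])  # explicit request wins

options(drop.unused.levels = NULL)       # leave the session as we found it
```

Because an explicit drop= argument always overrides the option, existing code that passes drop=FALSE would be unaffected.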
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat