[Rd] Strange behavior of model.frame() when given subset

Liaw, Andy andy_liaw at merck.com
Wed Apr 25 14:34:00 CEST 2012

Dear R-devel,

I recently received a bug report from a locfit user about the use of the subset argument when calling locfit().  The symptom is that the following two calls should produce the same result, but they don't:

locfit(y ~ lp(x, h=1), data=subset(dat, x > 1))
locfit(y ~ lp(x, h=1), data=dat, subset= x > 1)

I've tracked the problem down to the behavior shown in the following example, but I have no idea how to get further:

R> x <- 1:5
R> y <- sample(5)
R> m1 <- model.frame(y ~ lp(x))
R> m2 <- model.frame(y ~ lp(x), subset=x>1)
R> class(m1[[2]])
[1] "lp"
R> class(m2[[2]])
[1] "matrix"

So model.frame() seems to treat the lp() term differently depending on whether the subset argument is present.  Is this supposed to happen?  str(m1) and str(m2) show that, apart from m2 having one row fewer and its lp() term being of class "matrix" instead of "lp", there's no difference between m1 and m2.
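
The same symptom seems to be reproducible without locfit, which may help narrow things down.  The sketch below uses lp_demo(), a hypothetical stand-in for lp() that only assumes lp() returns a matrix carrying class "lp"; it is not the locfit implementation.  It also assumes, without my having confirmed it in the model.frame() source, that the subset step ends up row-indexing each column with the default `[`, for which no "lp" method exists here.

```r
# Hypothetical stand-in for locfit's lp(): a one-column matrix tagged with class "lp".
lp_demo <- function(x) {
  m <- cbind(x)
  class(m) <- "lp"
  m
}

x <- 1:5
y <- c(3, 1, 5, 2, 4)   # fixed values instead of sample(5), for reproducibility

m1 <- model.frame(y ~ lp_demo(x))
m2 <- model.frame(y ~ lp_demo(x), subset = x > 1)

inherits(m1[[2]], "lp")   # TRUE: the class survives when no subset is given
inherits(m2[[2]], "lp")   # FALSE: the class is gone once subset is used

# Row-indexing the classed matrix directly loses the class in the same way,
# since there is no "[" method for "lp_demo" objects in this sketch.
sub <- lp_demo(x)[2:5, , drop = FALSE]
inherits(sub, "lp")       # FALSE
```

If this is indeed the mechanism, the difference between m1 and m2 would come from the indexing step rather than from anything specific to lp() itself.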

I'd really appreciate it if anyone could shed some light on this.

Merck Research Labs

