[Rd] lm considers removed predictors when finding complete cases

EDUARDO GARCIA PORTUGUES edgarcia at est-econ.uc3m.es
Tue Dec 19 20:12:14 CET 2017

Dear R-devel list,

I realized that removing a predictor in lm with the "-" operator in the
formula does not affect which cases are treated as complete. A minimal
example is:

summary(lm(Wind ~ ., data = airquality))
# 42 observations deleted due to missingness

summary(lm(Wind ~ . - Ozone, data = airquality))
# still 42 observations deleted due to missingness, even if only 7 are
# missing for the response and the rest of the predictors

summary(lm(Wind ~ ., data = subset(airquality, select = -Ozone)))
# 7 observations deleted due to missingness
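
The counts above can be checked directly, and the same fit can be obtained
without subsetting the data. This is only a minimal sketch: the names
n_drop_all, n_drop_no_ozone, predictors and fit are illustrative, and it
assumes that building the formula explicitly with reformulate(), so that Ozone
is never mentioned, is an acceptable substitute for ". - Ozone".

```r
# Rows with at least one NA, using every column of airquality
n_drop_all <- sum(!complete.cases(airquality))

# Same count after leaving Ozone out of the data
keep <- setdiff(names(airquality), "Ozone")
n_drop_no_ozone <- sum(!complete.cases(airquality[, keep]))

# Build the formula from the remaining predictors, so Ozone never
# appears in it and cannot influence which rows are dropped
predictors <- setdiff(names(airquality), c("Wind", "Ozone"))
fit <- lm(reformulate(predictors, response = "Wind"), data = airquality)

n_drop_all           # rows dropped when Ozone is involved
n_drop_no_ozone      # rows dropped when Ozone is left out
length(fit$na.action)  # rows dropped by the explicit-formula fit
```

With the explicit formula the number of dropped rows should match the
subset() call above rather than the ". - Ozone" call.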

I find this behaviour somewhat surprising, and I was wondering whether it is
intended, or whether it would be appropriate to document it in lm's help.

Any insight on this issue is appreciated.

Best regards,
Eduardo García Portugués
Assistant professor
Department of Statistics
Carlos III University of Madrid

Office: 7.3.J21 (Leganés)
Phone: (+34) 91624 8836

