[Rd] lm considers removed predictors when finding complete cases

EDUARDO GARCIA PORTUGUES edgarcia at est-econ.uc3m.es
Tue Dec 19 20:12:14 CET 2017


Dear R-devel list,

I realized that removing a predictor in lm through the "-" operator in
formula() does not affect which complete cases are considered. A minimal
example is:

summary(lm(Wind ~ ., data = airquality))
# 42 observations deleted due to missingness

summary(lm(Wind ~ . - Ozone, data = airquality))
# still 42 observations deleted due to missingness, even though only 7 rows
# have missing values in the response and the remaining predictors

summary(lm(Wind ~ ., data = subset(airquality, select = -Ozone)))
# 7 observations deleted due to missingness
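
The counts above can be checked directly, and the predictor can also be
dropped by spelling out the remaining terms instead of using "-". The
following is only a sketch of one possible workaround, not a statement about
how lm is meant to be used; the names n_total, n_dropped_all,
n_dropped_no_ozone, predictors, f and fit are made up for illustration.

```r
# Rows that na.omit would drop when every column of airquality is considered
n_total <- nrow(airquality)
n_dropped_all <- n_total - sum(complete.cases(airquality))

# Rows that would be dropped if Ozone were not part of the data at all
no_ozone <- subset(airquality, select = -Ozone)
n_dropped_no_ozone <- n_total - sum(complete.cases(no_ozone))

# Possible workaround (illustrative): list the wanted predictors explicitly,
# so that Ozone never appears in the formula
predictors <- setdiff(names(airquality), c("Wind", "Ozone"))
f <- reformulate(predictors, response = "Wind")
fit <- lm(f, data = airquality)

# Number of observations actually used in the fit
nobs(fit)
```

With this formula only the rows with missing values in the remaining
variables should be dropped, which matches the third call above.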

I find this behaviour somewhat surprising, and I was wondering whether it is
intended or whether it would be appropriate to document it in lm's help.

Any insight on this issue is appreciated.

Best regards,
-- 
Eduardo García Portugués
Assistant professor
Department of Statistics
Carlos III University of Madrid

Office: 7.3.J21 (Leganés)
Phone: (+34) 91624 8836
