[R] predictions under NA

Christian Hoffmann christian.hoffmann at wsl.ch
Mon Sep 17 10:46:14 CEST 2001

Hi all,

Maybe I missed something in the help pages, but I am wondering if there is
a straightforward solution to the following problem:

Given a regressand y and several predictors x_i, some of which contain NAs:

y_1 = (none of the x_1i contain NA) %*% beta + eps_1
y_2 = (some of the x_2i contain NA) %*% beta + eps_2
y_3 = (none of the x_3i contain NA) %*% beta + eps_3
y_n = (none of the x_ni contain NA) %*% beta + eps_n

Fitting a linear model with lm to these data will use complete cases only
(if it does not fail outright), so that "fitted.values" may contain only

lm$fitted.values[1]  corresponding to y_1
lm$fitted.values[2]  corresponding to y_3
...
lm$fitted.values[#(of-complete-cases)]  corresponding to y_n
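
To make the situation concrete, here is a minimal sketch on made-up data (the data frame d, its columns and the seed are purely illustrative assumptions, not my real data). It only shows that the vector of fitted values is shorter than the data when a predictor contains an NA:

```r
# Hypothetical toy data: 5 observations, one NA in predictor x2 (row 2)
set.seed(1)
d <- data.frame(x1 = c(1, 2, 3, 4, 5),
                x2 = c(2, NA, 1, 5, 3))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(5, sd = 0.1)

# na.omit is spelled out here so the sketch does not depend on options()
fit <- lm(y ~ x1 + x2, data = d, na.action = na.omit)

nrow(d)               # 5 observations
length(fitted(fit))   # 4 fitted values: row 2 was dropped
names(fitted(fit))    # "1" "3" "4" "5": names are the original row labels
```

So the second fitted value belongs to the third observation, which is exactly the shift described above.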

This situation may be inconvenient. To get the full array of predictions I
would have (at least) two possibilities:

1. expected <- model.matrix(x) %*% lm$coefficients

2. judicious use of "complete.cases" and the like, to do something like
  expected[complete.cases(x)] <- lm$fitted.values
  expected[!complete.cases(x)] <- NA
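
A rough sketch of both possibilities on made-up data follows (d, fit, expected1 and expected2 are illustrative names only). One assumption worth stating: model.matrix drops incomplete rows by default, so for possibility 1 the sketch first builds the model frame with na.action = na.pass to keep all rows.

```r
# Hypothetical toy data: 5 observations, one NA in predictor x2 (row 2)
set.seed(1)
d <- data.frame(x1 = c(1, 2, 3, 4, 5),
                x2 = c(2, NA, 1, 5, 3))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(5, sd = 0.1)
fit <- lm(y ~ x1 + x2, data = d, na.action = na.omit)

# Possibility 1: full design matrix (NA rows kept) times the coefficients
tt <- delete.response(terms(fit))
mf <- model.frame(tt, d, na.action = na.pass)   # keep rows with NA
X  <- model.matrix(tt, mf)
expected1 <- drop(X %*% coef(fit))              # NA where a predictor is NA

# Possibility 2: put the fitted values back at the complete cases
cc <- complete.cases(d[, c("y", "x1", "x2")])
expected2 <- rep(NA_real_, nrow(d))
expected2[cc] <- fitted(fit)

# Both vectors have length nrow(d); compare them up to rounding error
all.equal(unname(expected1), expected2)
```

On this toy example the two approaches agree, but that of course does not settle whether they agree in general.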

1. Seems straightforward and foolproof, but may be computational overkill
in the case of very few NAs.
2. May not be foolproof because of internals in lm that are hidden or not
obvious at first glance (such as rearrangement of rows?)
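
One way to guard against the rearrangement worry in 2. might be to fill in the fitted values by row label instead of by position, since the fitted values carry the row names of the cases that were used. The sketch below does this on made-up data (d, fit, fit_ex and expected are illustrative names) and, as a cross-check, compares the result with a fit using na.action = na.exclude, for which fitted() pads the omitted cases with NA. Whether this covers every case is an assumption I have not verified.

```r
# Hypothetical toy data: 5 observations, one NA in predictor x2 (row 2)
set.seed(1)
d <- data.frame(x1 = c(1, 2, 3, 4, 5),
                x2 = c(2, NA, 1, 5, 3))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(5, sd = 0.1)
fit <- lm(y ~ x1 + x2, data = d, na.action = na.omit)

# Fill by row label, so the result does not depend on positions
expected <- setNames(rep(NA_real_, nrow(d)), rownames(d))
expected[names(fitted(fit))] <- fitted(fit)

# Cross-check: with na.exclude, fitted() returns one value per original row
fit_ex <- lm(y ~ x1 + x2, data = d, na.action = na.exclude)
all.equal(unname(expected), unname(fitted(fit_ex)))
```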

Does anybody have any thoughts on this?

Thank you very much.


Dr.sc.math.Christian W. Hoffmann
Mathematics and Statistical Computing
Landscape Modeling and Web Applications
Swiss Federal Research Institute WSL 
Zuercherstrasse 111
CH-8903 Birmensdorf, Switzerland
phone: ++41-1-739 22 77    fax: ++41-1-739 22 15
e-mail: christian.hoffmann_at_wsl.ch__prevent_spamming
www: http://www.wsl.ch/staff/christian.hoffmann/

