[R] Linear Model and Missing Data in Predictors

William Dunlap wdunlap at tibco.com
Tue Mar 15 17:47:07 CET 2016


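One technique for dealing with this is called 'multiple imputation':
each missing value is filled in several times with plausible draws, the
model is fitted to each completed data set, and the results are pooled.
Search for 'multiple imputation in R' to find R packages that implement
it (e.g., the 'mi' package).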
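
As a rough sketch of what that could look like on the example data from
the question: the code below uses the 'mice' package, which is swapped in
here for 'mi' only because its interface is compact. The data frame name
dat, the choice m = 20, and the seed are illustrative assumptions, not
recommendations.

```r
# Sketch only: 'mice' is used here in place of the 'mi' package named above.
library(mice)

# Data from the original question
set.seed(1235)
x1 <- seq(30)
x2 <- c(rep(NA, 9), rnorm(19) + 9, c(NA, NA))
x3 <- c(rnorm(17) - 2, rep(NA, 13))
y  <- exp(seq(1, 5, length.out = 30))
dat <- data.frame(y, x1, x2, x3)   # 'dat' is a hypothetical name

# Default lm(): every row containing an NA is dropped
cc_fit <- lm(y ~ x1 + x2 + x3, data = dat)
nobs(cc_fit)   # 8 of the 30 rows (only rows 10-17 are complete)

# Create m completed copies of the data (m = 20 is an arbitrary choice)
imp <- mice(dat, m = 20, seed = 1, printFlag = FALSE)

# Fit the same model on each completed data set, then pool the fits
fits   <- with(imp, lm(y ~ x1 + x2 + x3))
pooled <- pool(fits)
summary(pooled)
```

The pooled standard errors include the extra uncertainty from not knowing
the missing values. Note that this relies on the usual 'missing at random'
assumption; when the NAs are padding at the ends of a time series, the
imputed values are effectively extrapolations and should be treated with
caution.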

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Mar 15, 2016 at 8:14 AM, Lorenzo Isella <lorenzo.isella at gmail.com>
wrote:

> Dear All,
> Here is a situation that surely comes up very often. Suppose you have
> the following:
>
> set.seed(1235)
> x1 <- seq(30)
> x2 <- c(rep(NA, 9), rnorm(19) + 9, c(NA, NA))
> x3 <- c(rnorm(17) - 2, rep(NA, 13))
>
> y <- exp(seq(1, 5, length.out = 30))
>
> mm <- lm(y ~ x1 + x2 + x3)
>
> i.e. you fit a simple linear regression with several regressors,
> some of which have missing values.
> This happens to me when I work with time series that I use as
> regressors and whose missing values are padded with NAs.
> By default, lm discards every incomplete observation and therefore
> drops quite a lot of data.
> Is there any way to circumvent this? I mean, is there a way to fit
> something like a piecewise linear regression that uses all 3 regressors
> whenever possible but switches to 1 or 2 where data are missing?
> I ask because it is totally infeasible to work out the values of the
> missing data in my regressors, yet I cannot restrict my model to the
> intersection of the non-NA values in the 3 regressors. If this makes
> sense, do I have to code it myself, or is there a package that already
> implements it?
> Any suggestions are appreciated.
> Cheers
>
> Lorenzo
