[R] poly(x) workaround when x has missing values

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jan 25 08:50:08 CET 2007


Orthogonality of polynomials is not defined if they contain missing 
values, which seems a good enough reason to me.

To put it another way: in your solution, whether the columns are orthogonal 
depends on the unknown values behind the NAs, and it looks as if that holds 
only if those unknown values are all zero.
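
As a rough illustration of that point, the sketch below pads the result of 
poly() on the non-missing values in the same way as the quoted Poly(), then 
fills the NA rows with different values. The helper names fill_rows and 
check are made up for this example, and the fill values 0 and 1 are 
arbitrary choices, not anything poly() itself does.

```r
x <- 1:10
x[3] <- NA
ok <- !is.na(x)

# Orthogonal polynomial basis computed on the non-missing values only.
P <- poly(x[ok], 2)

# Hypothetical helper: build the full-length matrix, putting `value`
# in the rows where x is NA (the quoted Poly() uses NA here).
fill_rows <- function(value) {
  M <- matrix(value, nrow = length(x), ncol = ncol(P))
  M[ok, ] <- P
  M
}

# Hypothetical helper: the cross-product of the two columns and the
# column sums. All three are (numerically) zero when the columns are
# orthogonal to each other and to the constant term.
check <- function(M) c(cross = crossprod(M)[1, 2], sums = colSums(M))

na_fill   <- check(fill_rows(NA_real_))  # all NA: orthogonality is undefined
zero_fill <- check(fill_rows(0))         # numerically zero: orthogonal
one_fill  <- check(fill_rows(1))         # non-zero: no longer orthogonal

print(na_fill)
print(zero_fill)
print(one_fill)
```

With NA in the padded rows every check comes back NA, which is the sense 
in which orthogonality is not defined; only the zero fill leaves the 
cross-product and column sums at zero.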

On Wed, 24 Jan 2007, Jacob Wegelin wrote:

>
> Often in practical situations a predictor has missing values, so that poly
> crashes. For instance:
>
>> x<-1:10
>> y<- x -  3 * x^2 + rnorm(10)/3
>> x[3]<-NA
>> lm( y ~ poly(x,2) )
> Error in poly(x, 2) : missing values are not allowed in 'poly'
>>
>> lm( y ~ poly(x,2) , subset=!is.na(x)) # This does not help?!?
> Error in poly(x, 2) : missing values are not allowed in 'poly'
>
> The following function seems to be an okay workaround.
>
> Poly<- function(x, degree = 1, coefs = NULL, raw = FALSE, ...) {
>        notNA<-!is.na(x)
>        answer<-poly(x[notNA], degree=degree, coefs=coefs, raw=raw, ...)
>        THEMATRIX<-matrix(NA, nrow=length(x), ncol=degree)
>        THEMATRIX[notNA,]<-answer
>        attributes(THEMATRIX)[c('degree', 'coefs', 'class')]<- attributes(answer)[c('degree', 'coefs', 'class')]
>        THEMATRIX
> }
>
>
>>  lm( y ~ Poly(x,2) )
>
> Call:
> lm(formula = y ~ Poly(x, 2))
>
> Coefficients:
> (Intercept)  Poly(x, 2)1  Poly(x, 2)2
>      209.1        475.0        114.0
>
> and it works when x and y are in a dataframe too:
>
>> DAT<-data.frame(x=x, y=y)
>> lm(y~Poly(x,2), data=DAT)
>
> Call:
> lm(formula = y ~ Poly(x, 2), data = DAT)
>
> Coefficients:
> (Intercept)  Poly(x, 2)1  Poly(x, 2)2
>    -119.54      -276.11       -68.24
>
> Is there a better way to do this? My workaround seems a bit awkward.
> Whoever wrote "poly" must have had a good reason for not making it deal
> with missing values?
>
> Thanks for any thoughts
>
> Jacob Wegelin
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


