[R] Error in predict.lm() for models without intercept?
Rolf Turner
r.turner at auckland.ac.nz
Wed Sep 17 03:00:58 CEST 2014
When I run your code (in the single predictor case) I get exactly what I
would expect. In particular the standard errors are indeed proportional
to the (absolute) value of x, and the standard error is indeed 0 at x = 0.
The proportionality constant is exactly what it should be, explicitly
s/sqrt(sum(x^2)) where "s" is the estimate of sigma (obtained via
summary(mod)$sigma).
So all is in harmony in the universe.
It is ***VERY*** unlikely that there is any error in predict.lm(), which
has been around and in heavy use for a long, long time.
I don't know what you are seeing to make you think that there is an
error, but I am not seeing anything untoward.
cheers,
Rolf Turner
On 17/09/14 10:51, isabella at ghement.ca wrote:
> Hi everyone,
>
> It appears my R code didn't come through the first time (thanks for letting me know, Ista). Here is my message again:
>
> Could there be an error in the predict() function in R for models without intercepts which include one or more predictors?
> When using the predict() function to obtain the standard errors of the fitted values produced by a linear model (lm), the
> behaviour of the standard errors seems to mirror that of models which include an intercept (which should not happen).
>
> Here is an attempt to produce standard errors (and hence confidence bands) for the fitted values in a linear model with a
> single predictor and no intercept using R code:
>
> ## simulate a response y and two predictors x and z
>
> x <- rnorm(100,mean=0, sd=1)
>
> z <- runif(100,min=-1,max=1)
>
> y <- 1*x + 2*z + rnorm(100,mean=0, sd=1)
>
>
> ## fit a linear model with no intercept but with one predictor
>
> mod <- lm(y ~ 0 + x)
>
> ## compute confidence bands (i.e., fitted values +/- 1.96 standard errors of fitted values)
>
> conf.band.x <- predict(mod,newdata=data.frame(x = seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01)),
> interval="confidence")
>
> ## display confidence bands
>
> conf.band.x <- data.frame(lwr=conf.band.x[,"lwr"],
> fit=conf.band.x[,"fit"],
> upr=conf.band.x[,"upr"])
>
> matplot(x=seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01), y=conf.band.x, type="l", xlab="x", ylab="y")
> abline(v=mean(x),lty=3,col="magenta")
> title("Effect of x on y")
>
> According to statistical theory, in a model with no intercept and one predictor, the standard errors should be directly
> proportional to the value of x at which they are evaluated. If x=0, the standard errors should also be zero. If x increases,
> the standard errors should also increase. The resulting plot produced by matplot shows this is not the case - the standard
> errors appear to increase as one moves away from the average value of x. We would expect this behaviour if the model included
> an intercept, which is not the case here.
>
> Here is some R code for looking at standard errors of fitted values when the model includes no intercept and two predictors x
> and z. In this code, the value of the predictor z is set to its average level.
> ## linear model with no intercept but with two predictors
>
> mod <- lm(y ~ 0 + x + z)
>
> conf.band.x <- predict(mod,newdata=data.frame(x = seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01),
> z = mean(z)),
> interval="confidence")
>
> conf.band.x <- data.frame(lwr=conf.band.x[,"lwr"],
> fit=conf.band.x[,"fit"],
> upr=conf.band.x[,"upr"])
>
> matplot(x=seq(from=ceiling(min(x)),to=floor(max(x)),by=0.01), y=conf.band.x, type="l", xlab="x", ylab="y")
> abline(v=mean(x),lty=3,col="magenta")
> title("Partial Effect of x on y (obtained by setting z to its average level)")
>
> Again, the standard errors seem to behave as though they would come from a model with an intercept (given that they are
> flaring up as one moves away from the average value of the predictor x).
>
> I would very much appreciate any clarifications or suggestions for how to fix this problem.
>
> If the problem is confirmed, it appears to also carry over to the effects package in R, which constructs plots similar to the
> ones produced by matplot above by relying on the predict() function.
--
Rolf Turner
Technical Editor ANZJS
More information about the R-help
mailing list