[R] Wrong SEs in predict.lm(..., type="terms")

John Maindonald john.maindonald at anu.edu.au
Tue Apr 25 06:42:27 CEST 2000


predict.lm(..., type="terms") gives wrong standard errors.

Below, I have provided what I believe are the necessary fixes.
However, there are subtleties, and the code needs careful
checking.  Some of the looping is surely not necessary, but
it is surely best to begin with the minimum necessary changes.

My tests, including checks against S-PLUS, have extended to fitting
spline curves.  I have not tested the code with any example where
the pivot sequence does not run sequentially from 1 to p1.

For what it is worth, I am using RW-1.0.0 under Windows 98.

df<-data.frame(x=1:9,y=c(2,3,6,4,8,10,12,14,15))
> df.lm_lm(y~x,data=df)
> as.vector(predict(df.lm,se=T,type="terms")$se.fit)
[1] 0.9904885 0.7428664 0.4952442 0.2476221 0.0000000 0.2476221 0.4952442
[8] 0.7428664 0.9904885

Here is what one should get:
> abs((df$x-mean(df$x)))*summary(df.lm)$coef[2,2]
[1] 0.5769836 0.4327377 0.2884918 0.1442459 0.0000000 0.1442459 0.2884918
[8] 0.4327377 0.5769836

Here is the corrected code:


    if (type=="terms"){
      asgn <- attrassign(object)
      beta<-coef(object)
      hasintercept<-attr(tt,"intercept")>0
      if (hasintercept)
        asgn$"(Intercept)"<-NULL
      nterms<-length(asgn)
      predictor<-matrix(ncol=nterms,nrow=NROW(X))
      dimnames(predictor)<-list(rownames(X),names(asgn))

      if (se.fit){
        ip<-matrix(ncol=nterms,nrow=NROW(X))
        dimnames(ip)<-list(rownames(X),names(asgn))
      }
        avX<-apply(X,2,mean)  
        X<-sweep(X,2,avX)    # We'd best sweep out column means
                             # before we start
      for (i in seq(1,nterms,length=nterms)){
        ii<-piv[asgn[[i]]]
	        predictor[,i]<-X[,ii,drop=F]%*%(beta[ii])
        if (se.fit){
          ii2_asgn[[i]]      # X[,piv] matches rows & columns of R
          vci<-R[ii2,ii2]*res.var
          for(j in (1:NROW(X))){
            xi<-X[j,ii,drop=F]  # Do not multiply by beta[ii]
           ip[j,i]<-sum(xi%*% vci %*%t(xi))

          }
        }
      }

Here is the existing (1.0.0) code:

    if (type=="terms"){
      asgn <- attrassign(object)
      beta<-coef(object)
      hasintercept<-attr(tt,"intercept")>0
      if (hasintercept)
        asgn$"(Intercept)"<-NULL
      nterms<-length(asgn)
      predictor<-matrix(ncol=nterms,nrow=NROW(X))
      dimnames(predictor)<-list(rownames(X),names(asgn))
      if (se.fit){
        ip<-matrix(ncol=nterms,nrow=NROW(X))
        dimnames(ip)<-list(rownames(X),names(asgn))
      }
      for (i in seq(1,nterms,length=nterms)){
        if (hasintercept)  # Redundant
          i0<-1            # Redundant
        else               # Redundant
          i0<-NULL         # Redundant
        ii<-piv[asgn[[i]]]
        predictor[,i]<-X[,ii,drop=F]%*%(beta[ii])  # Should be centred
        X[,ii]<-X[,ii]-mean(X[,ii]) # Columns must be centred individually
        if (se.fit){
          vci<-R[ii,ii]*res.var    # The pivot should not be applied to R
          for(j in (1:NROW(X))){
            xi<-X[j,ii,drop=F]*(beta[ii])  # Do not Xply by beta
            ip[j,i]<-sum(xi%*% vci %*%t(xi))
          }
        }
      }


John Maindonald               email : john.maindonald at anu.edu.au        
Statistical Consulting Unit,  phone : (6249)3998        
c/o CMA, SMS,                 fax   : (6249)5549  
John Dedman Mathematical Sciences Building
Australian National University
Canberra ACT 0200
Australia
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list