[R] odd behavior of summary()$r.squared

Sundar Dorai-Raj sundar.dorai-raj at PDF.COM
Wed Oct 6 21:21:04 CEST 2004



J.R. Lockwood wrote:

> I may be missing something obvious here, but consider the following simple
> dataset simulating repeated measures on 5 individuals with pretty strong
> between-individual variance.
> 
> set.seed(1003)
> n<-5
> v<-rep(1:n,each=2)
> d<-data.frame(factor(v),v+rnorm(2*n))
> names(d)<-c("id","y")
> 
> Now consider the following two linear models that provide identical fitted
> values, residuals, and estimated residual variance:
>   
> m1<-lm(y~id,data=d)
> m2<-lm(y~id-1,data=d)
> print(max(abs(fitted(m1)-fitted(m2))))
> 
> The r-squared reported by summary(m1) appears to be correct in that it is
> equal to the squared correlation between the fitted and observed values:
> 
> print(summary(m1)$r.squared - cor(fitted(m1),d$y)^2)
> 
> However, the same is not true of m2.
> 
> print(summary(m2)$r.squared - cor(fitted(m2),d$y)^2)
> 
> 
> > R.version
> 
>          _
> platform i686-pc-linux-gnu
> arch     i686
> os       linux-gnu
> system   i686, linux-gnu
> status
> major    1
> minor    9.0
> year     2004
> month    04
> day      12
> language R

I think what you're trying to do is better accomplished by looking at 
the anova tables of the two fits:

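# row 1 of each anova table is the "id" term; column 2 is "Sum Sq"
a1 <- anova(m1)
a2 <- anova(m2)
r2.1 <- a1[1, "Sum Sq"]/sum(a1[, "Sum Sq"])
r2.2 <- a2[1, "Sum Sq"]/sum(a2[, "Sum Sq"])

# both differences should be zero, up to rounding error
summary(m1)$r.squared - r2.1
summary(m2)$r.squared - r2.2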

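The value you computed above with cor() still centers your data about 
the grand mean, which m2 doesn't fit. For a model without an intercept, 
summary() measures r.squared against the total sum of squares about 
zero instead, so the two numbers differ for m2 but agree for m1.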
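
To make the difference concrete, here is a rough sketch that rebuilds the data from the original post and computes both versions of r-squared by hand. The names tss.centered, tss.uncentered, r2.centered and r2.uncentered are made up for this illustration and do not appear in the thread. The sketch assumes the definition given in ?summary.lm, where the total sum of squares is taken about the mean of y when there is an intercept and about zero otherwise.

```r
# Illustrative sketch only: data and models as in the original post.
set.seed(1003)
n <- 5
v <- rep(1:n, each = 2)
d <- data.frame(id = factor(v), y = v + rnorm(2 * n))

m1 <- lm(y ~ id, data = d)      # with intercept
m2 <- lm(y ~ id - 1, data = d)  # no intercept, same fitted values

# Residual sum of squares is identical for the two fits
rss <- sum(residuals(m2)^2)

# Total sum of squares about the grand mean (what cor() implicitly uses)
tss.centered <- sum((d$y - mean(d$y))^2)
# Total sum of squares about zero (no grand mean removed)
tss.uncentered <- sum(d$y^2)

r2.centered <- 1 - rss / tss.centered
r2.uncentered <- 1 - rss / tss.uncentered

# Each difference should be zero, up to rounding error
summary(m1)$r.squared - r2.centered
summary(m2)$r.squared - r2.uncentered
cor(fitted(m2), d$y)^2 - r2.centered
```

If you want the centered figure for m2, computing it from the residuals as above (or simply taking it from m1) is probably the least surprising route.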

HTH,

--sundar



