John Sorkin
Fri Jul 13 19:22:30 CEST 2012
Pamela
R-squared with and without a zero intercept can be very different, because the regression lines you get from the two models can be very different. Have you plotted your data, plot(k[,2], k[,1]), to see whether a zero intercept is reasonable for your data? Have you drawn the regression lines from both models and compared them to the plot of your data?
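As for how R arrives at its value: according to the summary.lm help page, when the model has no intercept R measures the total sum of squares around zero rather than around the mean of the response. The following is a minimal sketch of that comparison using the data from the original message; the names x, y and fit0 are introduced here for illustration only and do not appear in the original code.

```r
# Data copied from the original message (29 observations).
# y corresponds to k[,1] and x to k[,2]; both names are illustrative.
y <- c(0.17170228, 0.10881539, 0.11843669, 0.11619201, 0.08441067,
       0.09424441, 0.04782264, 0.09526496, 0.11596476, 0.10323453,
       0.06487894, 0.08916484, 0.06358752, 0.07945473, 0.11213532,
       0.06531185, 0.11503484, 0.13679548, 0.13762677, 0.13126827,
       0.12350649, 0.12842441, 0.13075654, 0.15026602, 0.14536351,
       0.07841638, 0.08419016, 0.11995240, 0.14425678)
x <- c(0.11, 0.10, 0.11, 0.10, 0.10, 0.09, 0.10, 0.09, 0.09, 0.11,
       0.09, 0.10, 0.09, 0.10, 0.09, 0.10, 0.10, 0.10, 0.11, 0.10,
       0.11, 0.11, 0.12, 0.13, 0.15, 0.10, 0.09, 0.11, 0.12)

fit0 <- lm(y ~ x + 0)            # regression through the origin
rss  <- sum(residuals(fit0)^2)   # residual sum of squares

# Centered total sum of squares: deviations from mean(y).
# This is the form used in the manual calculation in the original message.
r2_centered <- 1 - rss / sum((y - mean(y))^2)

# Uncentered total sum of squares: deviations from zero.
# This is the form summary.lm uses when the model has no intercept.
r2_uncentered <- 1 - rss / sum(y^2)

print(c(summary    = summary(fit0)$r.squared,
        uncentered = r2_uncentered,
        centered   = r2_centered))
```

The first two values should agree with each other, while the centered value corresponds to the manual calculation. Neither number is wrong; they compare the fitted line against different baseline models (y = 0 versus y = mean(y)).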
John
Hi,
I have been using lm in R to do a linear regression and find the slope
coefficients and value for R-squared. The R-squared value reported by R
(R^2 = 0.9558) is very different than the R-squared value when I use the
same equation in Excel (R^2 = 0.328). I manually computed R-squared and the
Excel value is correct. I show my code for the determination of R^2 in R.
When I do not set 0 as the intercept, the R^2 value is the same in R and
Excel. In both cases the slope coefficients from R and from Excel are
identical.
k is a data frame with two columns.
M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
                         ## R^2 values in R and Excel
M2 = lm(k[,1]~k[,2])
sumM1 = summary(M1)
sumM2 = summary(M2)      ## get same value as Excel when intercept is not
                         ## set to 0
Below is what R returns for sumM1:
lm(formula = k[, 1] ~ k[, 2] + 0)
Residuals:
      Min        1Q    Median        3Q       Max
-0.057199 -0.015857  0.003793  0.013737  0.056178
Coefficients:
       Estimate Std. Error t value Pr(>|t|)
k[, 2]  1.05022    0.04266   24.62   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.02411 on 28 degrees of freedom
Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
F-statistic: 606.2 on 1 and 28 DF, p-value: < 2.2e-16
This is how the manual determination was performed. The value returned coincides with
the value from Excel:
#### trying to figure out why the R^2 for R and Excel are so different.
sqerr = (k[,1] - predict(M1))^2
sqtot = (k[,1] - mean(k[,1]))^2
R2 = 1 - sum(sqerr)/sum(sqtot) ## for 1D get 0.328 same as
                               ## Excel value
I am very puzzled by this. How does R compute the value for R^2 in this
case? Did I write the lm incorrectly?
Thanks
Pam
PS In case you are interested, the data I am using for the two columns is
below.
> k[,1]
[1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
0.06358752 0.07945473
[15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
0.08419016 0.11995240
[29] 0.14425678
> k[,2]
[1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12
