[R] Is this *always* the intended R^2 value for no intercept in lm?

David Winsemius dw|n@em|u@ @end|ng |rom comc@@t@net
Mon Nov 7 07:46:56 CET 2022


To Thierry: When you omit an intercept, you require that the line in multivariate space that represents the ‘predictions’ go through (0,0,0,…), i.e. the origin. That is a fairly restrictive requirement. There IS an intercept, even though it’s not explicitly seen in the model. Unless omitting it is required by theory in your domain of investigation, you are advised to avoid the practice.
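
As a rough sketch of where the two R^2 values in the quoted example come from, the code below recomputes them by hand from the help("lm") data. The names rss, tss_mean and tss_zero are mine and only illustrate the "mean of y versus zero" baseline described in help("summary.lm"); this is not a copy of the summary.lm source.

```r
# Data from the help("lm") example quoted further down in this thread
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.D9  <- lm(weight ~ group)      # intercept plus treatment contrast
lm.D90 <- lm(weight ~ group - 1)  # one coefficient per group, no intercept term

# Both parameterisations give the same fitted values,
# so they share one residual sum of squares.
same_fit <- isTRUE(all.equal(fitted(lm.D9), fitted(lm.D90)))
rss <- sum(residuals(lm.D9)^2)

# The total sum of squares depends on the baseline ("null") model.
tss_mean <- sum((weight - mean(weight))^2)  # baseline: predict mean(weight)
tss_zero <- sum(weight^2)                   # baseline: predict 0

r2_mean <- 1 - rss / tss_mean  # compare with summary(lm.D9)$r.squared
r2_zero <- 1 - rss / tss_zero  # compare with summary(lm.D90)$r.squared

print(same_fit)
print(all.equal(r2_mean, summary(lm.D9)$r.squared))
print(all.equal(r2_zero, summary(lm.D90)$r.squared))
```

The residuals are the same in both fits; only the denominator changes. Measured against a baseline that predicts zero for every plant, almost any fit looks good, which is why the no-intercept R^2 is so much larger.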

— 
David. 


> On Nov 5, 2022, at 12:41 PM, Bert Gunter <bgunter.4567 using gmail.com> wrote:
> 
> FAQ 7.41
> and
> https://stackoverflow.com/questions/57415793/r-squared-in-lm-for-ero-intercept-model
> 
> (among numerous others that could no doubt be found with a bit of
> searching).
> 
> In short, the "null models" against which you are comparing the fitted
> model are different with and without an intercept.
> 
> --Bert
> 
> 
> 
>> On Sat, Nov 5, 2022 at 11:52 AM Thierry Zell <thierry.zell using gmail.com> wrote:
>> 
>> I am puzzled by the computation of R^2 with intercept omitted that is
>> already illustrated by the following example taken from help("lm")
>> 
>> ## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
>> ## Page 9: Plant Weight Data.
>> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
>> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
>> group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
>> weight <- c(ctl, trt)
>> lm.D9 <- lm(weight ~ group)
>> lm.D90 <- lm(weight ~ group - 1) # omitting intercept
>> 
>> The calculations of R^2 for both models are consistent with the
>> help("summary.lm") description:
>> "y* is the mean of y[i] if there is an intercept and zero otherwise."
>> This causes a dramatic difference in the resulting R^2 values.
>> 
>> r2.D9 <- summary(lm.D9)$r.squared
>> r2.D90 <- summary(lm.D90)$r.squared
>> 
>> all.equal(r2.D9, 0.0730775989903856) #TRUE
>> all.equal(r2.D90, 0.981783272435264) #TRUE
>> 
>> This is counter-intuitive, to say the least, since the two models have
>> identical predictions, and both models could be described more
>> accurately as having two intercepts rather than zero. I see three
>> possibilities:
>> 
>> 1. This is the intended result, in which case no fix is required, but
>> I’d be curious to understand the argument better.
>> 2. This is an unfortunate outcome but not worth fixing as the user can
>> easily compute the correct R^2. In this case, I'd suggest that this
>> unintuitive behavior should be explicitly called out in the
>> documentation.
>> 3. This is a bug worth fixing.
>> 
>> I look forward to hearing the community’s opinion on this.
>> Thanks in advance!
>> 
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 


