[R] [FORGED] standard error for regression coefficients corresponding to factor levels

Fri Mar 17 15:44:37 CET 2017

Just to do a better job in what is perhaps more "R-ish" w.r.t least squares calculations, here is perhaps a better example

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

X <- model.matrix(lm.D9)
Xqr <- qr(X)
Q <- qr.Q(Xqr)
R <- qr.R(Xqr)

### this is same as just (X'X)^-1
 chol2inv(R)

### Compared to the prior way I mentioned

 solve(crossprod(X))

-----Original Message-----
From: Doran, Harold 
Sent: Friday, March 17, 2017 5:06 AM
To: Rolf Turner <r.turner at auckland.ac.nz>; li li <hannah.hlx at gmail.com>
Cc: r-help <r-help at r-project.org>
Subject: Re: [R] [FORGED] standard error for regression coefficients corresponding to factor levels

A slightly more ³R-ish² way of doing

S <- solve(t(X)%*%X)

Is to instead use

S <- solve(crossprod(X))

And the idea idea of inverting the SSCP matrix only and not actually solving the linear system is not so great, which is why it is better to do as Rolf is suggesting and get all things you need from lm, which uses decompositions and not the algebraic representations for the solution to the linear system.

On 3/16/17, 7:41 PM, "Rolf Turner" <r.turner at auckland.ac.nz> wrote:

>
>You have been posting to the R-help list long enough so that you should 
>have learned by now *not* to post in html.  Your code is mangled so as 
>to be unreadable.
>
>A few comments:
>
>(1) Your data frame "data1" seems to have a mysterious (and 
>irrelevant?) column named "data1" as well.
>
>(2) The covariance matrix of your coefficient estimates is indeed (as 
>you hint) a constant multiple of (X^T X)^{-1}.  So do:
>
>     X <- model.matrix(~response*week,data=data1)
>     S <- solve(t(X)%*%X)
>     print(S)
>
>and you will see the same pattern of constancy that your results exhibit.
>
>(3) You could get the results you want much more easily, without all 
>the fooling around buried in your (illegible) code, by doing:
>
>     mod <- lm(response ~ (region - 1)/week,data=data1)
>     summary(mod)
>
>cheers,
>
>Rolf Turner
>
>--
>Technical Editor ANZJS
>Department of Statistics
>University of Auckland
>Phone: +64-9-373-7599 ext. 88276
>
>On 17/03/17 07:26, li li wrote:
>> Hi all,
>>   I have the following data called "data1". After fitting the ancova 
>>model  with different slopes and intercepts for each region, I 
>>calculated the  regression coefficients and the corresponding standard 
>>error. The standard  error (for intercept or for slope) are all the 
>>same for different regions.
>> Is there something wrong?
>>   I know the SE is related to (X^T X)^-1, where X is design matrix. 
>>So does  this happen whenever each factor level has the same set of 
>>values for  "week"?
>>      Thanks.
>>      Hanna
>>
>>
>>
>>> mod <- lm(response ~ region*week, data1)> tmp <- coef(summary(mod))> 
>>>res <- matrix(NA, 5,4)> res[1,1:2] <- tmp[1,1:2]> res[2:5,1] <- 
>>>tmp[1,1]+tmp[2:5,1]> res[2:5,2] <- sqrt(tmp[2:5,2]^2-tmp[1,2]^2)> 
>>>res[1,3:4] <- tmp[6,1:2]> res[2:5,3] <- tmp[6,1]+tmp[7:10,1]> 
>>>res[2:5,4] <- sqrt(tmp[7:10,2]^2-tmp[6,2]^2)
>>
>>> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")>
>>>rownames(res) <- letters[1:5]> res   intercept intercept SE
>>>slope   slope SE
>> a 0.18404464   0.08976301 -0.018629310 0.01385073
>> b 0.17605666   0.08976301 -0.022393789 0.01385073
>> c 0.16754130   0.08976301 -0.022367770 0.01385073
>> d 0.12554452   0.08976301 -0.017464385 0.01385073
>> e 0.06153256   0.08976301  0.007714685 0.01385073
>>
>>
>>
>>
>>
>>
>>
>>> data1    week region     response
>> 5      3      c  0.057325067
>> 6      6      c  0.066723632
>> 7      9      c -0.025317808
>> 12     3      d  0.024692613
>> 13     6      d  0.021761492
>> 14     9      d -0.099820335
>> 19     3      c  0.119559235
>> 20     6      c -0.054456186
>> 21     9      c  0.078811180
>> 26     3      d  0.091667189
>> 27     6      d -0.053400777
>> 28     9      d  0.090754363
>> 33     3      c  0.163818085
>> 34     6      c  0.008959741
>> 35     9      c -0.115410852
>> 40     3      d  0.193920693
>> 41     6      d -0.087738914
>> 42     9      d  0.004987542
>> 47     3      a  0.121332285
>> 48     6      a -0.020202707
>> 49     9      a  0.037295785
>> 54     3      b  0.214304603
>> 55     6      b -0.052346480
>> 56     9      b  0.082501222
>> 61     3      a  0.053540767
>> 62     6      a -0.019182819
>> 63     9      a -0.057629113
>> 68     3      b  0.068592791
>> 69     6      b -0.123298216
>> 70     9      b -0.230671818
>> 75     3      a  0.330741562
>> 76     6      a  0.013902905
>> 77     9      a  0.190620360
>> 82     3      b  0.151002874
>> 83     6      b  0.086177696
>> 84     9      b  0.178982656
>> 89     3      e  0.062974799
>> 90     6      e  0.062035391
>> 91     9      e  0.206200831
>> 96     3      e  0.123102197
>> 97     6      e  0.040181790
>> 98     9      e  0.121332285
>> 103    3      e  0.147557564
>> 104    6      e  0.062035391
>> 105    9      e  0.144965770
>>
>> 	[[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.