[R] Formula for calculating interaction terms in R

Stephen Montgomery sm8 at sanger.ac.uk
Mon Nov 9 18:37:37 CET 2009


Hello -

I am trying to figure out R's transformation for interaction terms in a
linear regression.

My basic understanding is that an interaction term is generally
calculated by multiplying the centred (zero-mean) variables together and
then running the regression.  Given that, I would have expected to see
the same p-value when I calculate

summary(lm(Y~A:B))

for A:B
as when I calculate

T<-scale(A) * scale(B)
summary(lm(Y~T))
for T

Is this correct?  Apologies if this is overly trivial or I am missing
something here.  In the example below you can see that p-value(A:B) !=
p-value(T).

Thanks for the help!

All the best,
Stephen

PS My goal is to determine, given two variables, what transform I can
use to create a single variable representing their interaction, which
can then be tested against the response variable in a regression
framework and yield a corresponding p-value.


> foo
    Y A B
R0  1 2 2
R1  2 2 1
R2  3 4 1
R3  4 2 2
R4  5 1 1
R5  6 3 2
R6  7 3 3
R7  8 4 3
R8  9 3 1
R9 10 2 4

> summary(lm(foo$Y~foo$A:foo$B))

Call:
lm(formula = foo$Y ~ foo$A:foo$B)

Residuals:
   Min     1Q Median     3Q    Max 
-3.869 -1.619 -0.524  1.230  4.616 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   2.9274     1.6192   1.808   0.1082  
foo$A:foo$B   0.4854     0.2603   1.865   0.0992 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 2.681 on 8 degrees of freedom
Multiple R-squared: 0.303,      Adjusted R-squared: 0.2159 
F-statistic: 3.478 on 1 and 8 DF,  p-value: 0.09918

> T<-scale(foo$A) * scale(foo$B)
> summary(lm(foo$Y~T))

Call:
lm(formula = foo$Y ~ T)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5247 -2.5341  0.1729  2.5094  4.1788 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   5.5247     1.0182   5.426 0.000627 ***
T            -0.2516     1.1181  -0.225 0.827614    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 3.201 on 8 degrees of freedom
Multiple R-squared: 0.006289,   Adjusted R-squared: -0.1179 
F-statistic: 0.05063 on 1 and 8 DF,  p-value: 0.8276
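
A possible explanation for the mismatch, offered as a sketch rather than
a definitive answer: for numeric variables, A:B in a formula with no main
effects is the raw elementwise product A*B, with no centring applied, so
it is a different regressor from scale(A) * scale(B).  The two products
differ by terms that are linear in A and B, so the interaction test
should only be unaffected by centring and scaling once both main effects
are in the model (Y ~ A * B).  The code below rebuilds foo from the table
above and checks both points.  The names fit_colon, fit_raw,
fit_full_raw, fit_full_std, Ac and Bc are illustrative and do not come
from the original post.

```r
# Data copied from the table printed in the post.
foo <- data.frame(
  Y = 1:10,
  A = c(2, 2, 4, 2, 1, 3, 3, 4, 3, 2),
  B = c(2, 1, 1, 2, 1, 2, 3, 3, 1, 4),
  row.names = paste0("R", 0:9)
)

# 1. Without main effects, A:B is the uncentred product A * B.
fit_colon <- lm(Y ~ A:B, data = foo)
fit_raw   <- lm(Y ~ I(A * B), data = foo)
p_colon <- coef(summary(fit_colon))[2, "Pr(>|t|)"]
p_raw   <- coef(summary(fit_raw))[2, "Pr(>|t|)"]
print(c(colon = p_colon, raw_product = p_raw))

# 2. With both main effects present, standardising A and B rescales the
#    interaction coefficient but leaves its t-test unchanged.
foo$Ac <- as.numeric(scale(foo$A))
foo$Bc <- as.numeric(scale(foo$B))
fit_full_raw <- lm(Y ~ A * B, data = foo)    # expands to A + B + A:B
fit_full_std <- lm(Y ~ Ac * Bc, data = foo)  # expands to Ac + Bc + Ac:Bc
p_int_raw <- coef(summary(fit_full_raw))["A:B", "Pr(>|t|)"]
p_int_std <- coef(summary(fit_full_std))["Ac:Bc", "Pr(>|t|)"]
print(c(raw = p_int_raw, standardised = p_int_std))
```

If the aim is a single test of the interaction, the A:B row of
summary(fit_full_raw) is one candidate.  Note that it tests the
interaction given both main effects, which is not the same question as
regressing Y on a product term alone.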







