[R] Interaction term in multiple regression

kfortino at email.unc.edu
Tue Jul 14 03:31:41 CEST 2009


Hello all,

Thank you for taking my question. I am looking for information on how
R handles interaction terms in a multiple regression fitted with the
“lm” command. I first noticed something unusual when my R output did
not match the JMP output from an identical analysis run previously.
Both programs give identical results for the overall model F test, and
when the models do not contain the interaction term the output is
identical. However, the results of the partial F tests differ
dramatically when the interaction term is included.

Here are the results from R of the test with the interaction:

> summary(lm(TD[Year==2007]~Kd[Year==2007]*area[Year==2007], data=boon_tot))

Call:
lm(formula = TD[Year == 2007] ~ Kd[Year == 2007] * area[Year == 2007],
    data = boon_tot)

Residuals:
     Min       1Q   Median       3Q      Max
-0.42696 -0.25648 -0.11960  0.03151  1.27957

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)
(Intercept)                           5.5714     1.7995   3.096   0.0148 *
Kd[Year == 2007]                      0.2867     4.0696   0.070   0.9456
area[Year == 2007]                    0.8192     0.2874   2.851   0.0215 *
Kd[Year == 2007]:area[Year == 2007]  -1.8074     0.6320  -2.860   0.0211 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5238 on 8 degrees of freedom
Multiple R-squared: 0.6826,     Adjusted R-squared: 0.5636
F-statistic: 5.736 on 3 and 8 DF,  p-value: 0.02155

Here are the results from JMP for the same model:

Source     df   SS           MS           F            p
Model       3   4.72157318   1.57385773   5.73591141   0.02155127
Error       8   2.19509349   0.27438669
C. Total   11   6.91666667

Source                         Est.         Std Error    t value      p > t
Intercept                      10.4933505   1.24016642   8.46124381   0.00002911
Kd                             -11.213166   2.95096414   -3.7998315   0.00523792
area (ha)                      0.04560254   0.03069489   1.48567197   0.17567049
(Kd-0.428)*(area (ha)-6.3625)  -1.8074455   0.63195669   -2.860078    0.02114887


As you can see, although the overall model F test and the interaction
term are identical in both programs, the estimates and standard errors
of the other terms (the intercept, Kd, and area) are very different.

Additionally, if I remove the interaction term from the model, the two
programs give identical results.
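
One check that might narrow this down: the JMP term label,
(Kd-0.428)*(area (ha)-6.3625), looks like a product of centered
predictors, whereas Kd*area in an R formula uses the raw product. Below
is a minimal sketch of fitting both codings in R. It uses simulated
data, not the boon_tot measurements; the data frame `d`, the column
`Kd_c_area_c`, and the seed are made up for illustration, and it
assumes the centering constants are the sample means.

```r
# Simulated stand-in data; NOT the boon_tot measurements.
set.seed(1)
d <- data.frame(Kd = runif(12, 0.2, 0.7), area = runif(12, 1, 15))
d$TD <- 10 - 11 * d$Kd + 0.05 * d$area -
  1.8 * (d$Kd - mean(d$Kd)) * (d$area - mean(d$area)) +
  rnorm(12, sd = 0.5)

# R's default coding: the interaction column is the raw product Kd * area.
fit_raw <- lm(TD ~ Kd * area, data = d)

# Centered product, as the JMP label suggests
# (assumption: centered at the sample means).
d$Kd_c_area_c <- (d$Kd - mean(d$Kd)) * (d$area - mean(d$area))
fit_ctr <- lm(TD ~ Kd + area + Kd_c_area_c, data = d)

coef(summary(fit_raw))
coef(summary(fit_ctr))

# Both codings span the same model, so the fitted values and the
# interaction coefficient agree; only the intercept and the two
# main-effect rows change.
all.equal(fitted(fit_raw), fitted(fit_ctr))
all.equal(unname(coef(fit_raw)["Kd:area"]),
          unname(coef(fit_ctr)["Kd_c_area_c"]))
```

Under this coding the centered Kd coefficient equals the raw Kd
coefficient plus the interaction coefficient times the area center.
With the numbers above, 0.2867 + (-1.8074)(6.3625) is about -11.213,
and 0.8192 + (-1.8074)(0.428) is about 0.0456, which are the JMP
estimates for Kd and area.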

Any thoughts as to why they differ would be appreciated.

Sincerely
Ken



