[R] Regression query

Devshruti Pahuja devshruti at hotmail.com
Fri Jun 11 11:26:00 CEST 2004


Hi

I have a data set with both quantitative and categorical predictors.
After scaling the response variable, I checked for multicollinearity
(VIF values) among the predictors and removed the predictors that were
hiding some of the other significant predictors. I'm curious to know
whether predictors that are not significant in a simple 'lm' fit can
still be involved in interactions. How do I take into account
interactions involving the predictors that I removed solely on the
basis of multicollinearity?

I'd appreciate it if someone could shed some light on this matter and
on how to use R to detect interactions effectively.

Thanks

Regards
Dev

------Final 'lm model'--------------------
> logmodelfull_minus_run_hr_walk_batting <- lm(log(salary) ~ hit + rbi + walk +
+     obp + strike.out + free.agent.eligible + free.agent.1991 + arbitr.elgible.)
> summary(logmodelfull_minus_run_hr_walk_batting)

Call:
lm(formula = log(salary) ~ hit + rbi + walk + obp + strike.out +
    free.agent.eligible + free.agent.1991 + arbitr.elgible.)

Residuals:
     Min       1Q   Median       3Q      Max
-2.41786 -0.28911 -0.02814  0.31890  1.49007

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           5.340782   0.251218  21.260  < 2e-16 ***
hit                   0.004479   0.001158   3.867 0.000133 ***
rbi                   0.011102   0.002195   5.059 7.05e-07 ***
walk                  0.005421   0.002206   2.457 0.014533 *
obp                  -1.385584   0.824105  -1.681 0.093653 .
strike.out           -0.005399   0.001438  -3.755 0.000205 ***
free.agent.eligible1  1.611521   0.080657  19.980  < 2e-16 ***
free.agent.19911     -0.301243   0.103481  -2.911 0.003848 **
arbitr.elgible.1      1.293059   0.086696  14.915  < 2e-16 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 0.5351 on 328 degrees of freedom
Multiple R-Squared: 0.7981,     Adjusted R-squared: 0.7932
F-statistic: 162.1 on 8 and 328 DF,  p-value: < 2.2e-16

----------------------------------------------------------------------------
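
For reference, the VIF screening mentioned in the question can be sketched by hand as follows. This is only an illustration under stated assumptions: the data are simulated (the baseball data are not reproduced here), the column names hit, run and walk are borrowed from the post, and `vif.manual` is a hypothetical name, not part of any package.

```r
# Simulated stand-in predictors; values are made up for illustration only.
set.seed(2)
n <- 200
hit  <- rpois(n, 90)
run  <- round(0.5 * hit + rnorm(n, sd = 5))  # deliberately correlated with hit
walk <- rpois(n, 35)
X <- data.frame(hit, run, walk)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all of the other predictors.
vif.manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif.manual, 2))
```

A high VIF only says that a predictor is well explained by the others; it does not by itself say whether that predictor takes part in an interaction.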


--------------with interactions---------------------------------------------

>
> summary(baseball.lgmodel_with_interactions_ALL_arbid)

Call:
lm(formula = log(salary) ~ hit + rbi + strike.out + free.agent.eligible +
    free.agent.1991 + arbitr.elgible. + hit * free.agent.1991 +
    hit * arbitr.elgible. + hit * rbi + rbi * free.agent.eligible +
    rbi * arbitr.elgible. + rbi * arbitr.1991 + hit * strike.out +
    strike.out * free.agent.eligible + strike.out * arbitr.elgible. +
    strike.out * run + strike.out * hr + hit * free.agent.eligible +
    free.agent.eligible * run + hit * free.agent.1991 + strike.out *
    free.agent.1991 + free.agent.1991 * batting + free.agent.1991 *
    obp + arbitr.elgible. * run + batting * double + obp * run +
    obp * hr + walk * stolen.base + hit * arbitr.1991 + free.agent.eligible *
    double + arbitr.elgible. * double + strike.out * triple +
    triple * batting + triple * walk + triple * walk + hit *
    hr + rbi * hr + free.agent.eligible * hr + free.agent.1991 *
    hr + arbitr.elgible. * hr + hr * arbitr.1991 + hit * walk +
    free.agent.eligible * walk + walk * rbi + rbi * stolen.base +
    strike.out * stolen.base + stolen.base * batting + stolen.base *
    walk + stolen.base * rbi + stolen.base * walk + arbitr.elgible. *
    error)

Residuals:
     Min       1Q   Median       3Q      Max
-2.29352 -0.28287 -0.03748  0.29790  1.31590

Coefficients:
                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                      5.217e+00  3.467e-01  15.048  < 2e-16 ***
hit                              6.927e-03  6.226e-03   1.112 0.266889
rbi                              1.908e-02  1.150e-02   1.658 0.098350 .
strike.out                      -5.692e-03  4.586e-03  -1.241 0.215517
free.agent.eligible1             1.287e+00  2.259e-01   5.699 3.05e-08 ***
free.agent.19911                 3.828e-01  6.575e-01   0.582 0.560914
arbitr.elgible.1                 1.038e+00  2.195e-01   4.726 3.63e-06 ***
arbitr.19911                    -1.024e+00  4.392e-01  -2.331 0.020443 *
run                              4.932e-02  2.905e-02   1.698 0.090682 .
hr                              -1.093e-01  7.208e-02  -1.516 0.130543
batting                         -1.814e-01  2.558e+00  -0.071 0.943522
obp                             -1.375e+00  2.253e+00  -0.610 0.542099
double                          -5.259e-02  4.489e-02  -1.172 0.242349
walk                             1.395e-02  9.757e-03   1.430 0.153808
stolen.base                     -1.685e-02  4.299e-02  -0.392 0.695372
triple                          -1.367e-01  1.600e-01  -0.854 0.393807
error                           -4.097e-03  6.879e-03  -0.595 0.552007
hit:free.agent.19911             8.248e-04  4.611e-03   0.179 0.858174
hit:arbitr.elgible.1             4.873e-03  6.448e-03   0.756 0.450395
hit:rbi                         -1.382e-04  7.709e-05  -1.792 0.074184 .
rbi:free.agent.eligible1         5.352e-03  9.555e-03   0.560 0.575855
rbi:arbitr.elgible.1            -3.384e-03  1.136e-02  -0.298 0.766072
rbi:arbitr.19911                 3.596e-02  2.179e-02   1.650 0.100046
hit:strike.out                   5.480e-06  5.446e-05   0.101 0.919917
strike.out:free.agent.eligible1 -2.570e-03  4.282e-03  -0.600 0.548890
strike.out:arbitr.elgible.1     -9.703e-04  5.234e-03  -0.185 0.853068
strike.out:run                   1.685e-04  1.246e-04   1.352 0.177345
strike.out:hr                   -3.088e-04  2.277e-04  -1.356 0.176229
hit:free.agent.eligible1        -1.359e-03  6.224e-03  -0.218 0.827363
free.agent.eligible1:run         1.248e-02  9.109e-03   1.370 0.171917
strike.out:free.agent.19911     -1.851e-02  5.974e-03  -3.099 0.002140 **
free.agent.19911:batting         7.076e-01  6.200e+00   0.114 0.909215
free.agent.19911:obp            -1.421e+00  3.952e+00  -0.360 0.719394
arbitr.elgible.1:run            -8.541e-03  8.773e-03  -0.974 0.331100
batting:double                   2.346e-01  1.609e-01   1.458 0.145884
run:obp                         -1.825e-01  7.492e-02  -2.436 0.015462 *
hr:obp                           3.687e-01  2.116e-01   1.742 0.082608 .
walk:stolen.base                -6.789e-05  1.557e-04  -0.436 0.663083
hit:arbitr.19911                -5.835e-03  7.084e-03  -0.824 0.410808
free.agent.eligible1:double     -1.151e-02  1.663e-02  -0.692 0.489334
arbitr.elgible.1:double          2.169e-03  1.938e-02   0.112 0.910985
strike.out:triple               -8.106e-04  6.023e-04  -1.346 0.179475
batting:triple                   5.179e-01  5.599e-01   0.925 0.355841
walk:triple                      8.755e-04  9.262e-04   0.945 0.345349
hit:hr                          -3.320e-04  2.626e-04  -1.264 0.207180
rbi:hr                           4.748e-04  3.015e-04   1.575 0.116414
free.agent.eligible1:hr          1.840e-02  2.313e-02   0.796 0.426972
free.agent.19911:hr              7.216e-02  1.889e-02   3.819 0.000165 ***
arbitr.elgible.1:hr              4.111e-02  2.803e-02   1.467 0.143564
arbitr.19911:hr                 -2.368e-02  4.647e-02  -0.510 0.610723
hit:walk                         3.173e-05  7.826e-05   0.405 0.685442
free.agent.eligible1:walk       -5.423e-03  4.984e-03  -1.088 0.277472
rbi:walk                        -7.569e-05  1.313e-04  -0.577 0.564598
rbi:stolen.base                  3.980e-05  1.605e-04   0.248 0.804409
strike.out:stolen.base          -2.611e-04  1.615e-04  -1.617 0.107004
batting:stolen.base              1.552e-01  1.434e-01   1.082 0.280020
arbitr.elgible.1:error           3.930e-03  1.390e-02   0.283 0.777495
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 0.4925 on 280 degrees of freedom
Multiple R-Squared: 0.854,      Adjusted R-squared: 0.8248
F-statistic: 29.24 on 56 and 280 DF,  p-value: < 2.2e-16
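
One possible way to check interactions more selectively than fitting them all at once is to compare nested models. The sketch below is not the analysis from this post: it uses simulated data with column names borrowed from the output above, and the object names (`dat`, `fit.main`, `fit.int`, `fit.all2`, `cmp`, `scr`) are hypothetical. It adds a single interaction with update() and tests it with anova(), then screens all two-way interactions among a small set of predictors with drop1().

```r
# Simulated stand-in data; column names mirror the post, values are made up.
set.seed(1)
n <- 300
dat <- data.frame(
  hit = rpois(n, 90),
  hr  = rpois(n, 10),
  free.agent.1991 = factor(rbinom(n, 1, 0.2))
)
# Response built with a genuine hr-by-free.agent.1991 interaction.
dat$log.salary <- 5 + 0.005 * dat$hit + 0.02 * dat$hr +
  0.07 * dat$hr * (dat$free.agent.1991 == "1") + rnorm(n, sd = 0.5)

# Main-effects model, then the same model plus one interaction term.
fit.main <- lm(log.salary ~ hit + hr + free.agent.1991, data = dat)
fit.int  <- update(fit.main, . ~ . + hr:free.agent.1991)

# Partial F-test for the added interaction term.
cmp <- anova(fit.main, fit.int)
print(cmp)

# All two-way interactions among these predictors; drop1() tests
# each interaction term given everything else in the model.
fit.all2 <- lm(log.salary ~ (hit + hr + free.agent.1991)^2, data = dat)
scr <- drop1(fit.all2, test = "F")
print(scr)
```

The same pattern could be applied to a predictor that was dropped for multicollinearity: put it back together with the interaction of interest and compare the two fits with anova(), rather than reading individual t-tests from one very large model.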



