[R] Varying statistical significance in estimates of linear model

Stathis Kamperis ekamperi at gmail.com
Fri Aug 9 22:20:09 CEST 2013


For the archives, a summary:

1. Reference: "Practical Regression and Anova using R" by Julian Faraway.
2. Possible reason: multicollinearity among the predictor variables.
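
A minimal sketch of point 2, using simulated data (the variables, sample
size and coefficients below are made up for illustration; this is not my
actual data set). Here x2 is almost a copy of x1 and only x2 truly drives
y. The variance inflation factor is computed by hand as 1 / (1 - R^2),
where R^2 comes from regressing x1 on the other predictor:

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)     # x2 is nearly collinear with x1
y  <- 1 + 2 * x2 + rnorm(n)       # y depends on x2 only (by construction)

m1 <- lm(y ~ x1)                  # x1 alone: acts as a proxy for x2
m2 <- lm(y ~ x1 + x2)             # both predictors: they compete

# p-value and standard errors of the x1 estimate in each model
p_m1  <- summary(m1)$coefficients["x1", "Pr(>|t|)"]
se_m1 <- summary(m1)$coefficients["x1", "Std. Error"]
se_m2 <- summary(m2)$coefficients["x1", "Std. Error"]

# correlation between the predictors and the VIF of x1 in m2
r12    <- cor(x1, x2)
vif_x1 <- 1 / (1 - summary(lm(x1 ~ x2))$r.squared)

print(c(p_m1 = p_m1, se_m1 = se_m1, se_m2 = se_m2,
        cor = r12, vif = vif_x1))
```

In m1 the estimate for x1 is clearly significant, because x1 stands in
for x2. In m2 the standard error of x1 is inflated by roughly the square
root of the VIF, so its estimate will typically no longer be significant.
A large VIF is a hint that the individual p-values should not be
interpreted in isolation.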

Thanks everybody!

On Thu, Aug 8, 2013 at 1:43 PM, Stathis Kamperis <ekamperi at gmail.com> wrote:
> Hi everyone,
>
> I have a response variable 'y' and several predictor variables 'x_i'.
> I start with a linear model:
>
> m1 <- lm(y ~ x1); summary(m1)
>
> and I get a statistically significant estimate for 'x1'. Then, I
> modify my model as:
>
> m2 <- lm(y ~ x1 + x2); summary(m2)
>
> At this point, the estimate for x1 may become non-significant while
> the estimate for x2 is significant.
>
> As I add more predictor variables (or interaction terms), the set of
> estimates that come out statistically significant changes: sometimes
> x1, x2 and x6 are significant, at other times x2, x4 and x5 are.
>
> It seems to me that I could tweak my model in such a way (by
> adding/removing predictor variables or "suitable" interaction terms)
> that I could "prove" whatever I'd like to prove.
>
> What is the proper methodology here? What do people do in such
> cases? I can provide the data if anyone would like to have a look
> at it.
>
> Best regards,
> Stathis Kamperis
