[R] How to exclude insignificant intercepts using "step" function

David Winsemius dwinsemius at comcast.net
Tue Jun 23 06:45:25 CEST 2009


I think you should explain (to yourself primarily) what it means to
have a non-significant intercept. If you can justify the exclusion of
an intercept on theoretical grounds, then you may get more assistance.
However, if you are just naively questing after some mythical concept
of "significance", people may be less motivated to solve what most
would consider to be an "insignificant" question.
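
If there is a theoretical case for regression through the origin, one
option is to compare the two fits explicitly rather than asking step()
to do it. As far as I know, step() only adds and drops terms listed
among the model's term labels, and the intercept is not one of them,
which would explain why it is never considered for removal. A minimal
sketch, assuming simulated data like grp A in the quoted post (the
names dA, fit_int, fit_noint and the seed are made up for illustration):

```r
# Sketch only: data shaped like grp A in the quoted post (y = x1 + x2 + noise).
set.seed(1)  # assumption: seed added so the example is reproducible
x1 <- runif(100)
x2 <- runif(100)
dA <- data.frame(x1 = x1, x2 = x2, y = x1 + x2 + rnorm(100, 0, 0.05))

fit_int   <- lm(y ~ x1 + x2, data = dA)      # model with an intercept
fit_noint <- lm(y ~ 0 + x1 + x2, data = dA)  # regression through the origin

# The candidate terms for dropping are the term labels; no intercept here.
attr(terms(fit_int), "term.labels")

# Stepwise selection starting from the intercept model.
sel <- step(fit_int, trace = 0)
attr(terms(sel), "intercept")   # 1 means the intercept is still in the model

# Explicit comparison of the nested fits (F test and AIC).
anova(fit_noint, fit_int)
AIC(fit_noint, fit_int)
```

Note that the R-squared reported by summary() is computed differently
for a no-intercept fit, so it should not be used to compare the two
models.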

-- 
DW

On Jun 22, 2009, at 11:38 PM, Chris Friedl wrote:

>
> I posted this question way down at the end of another thread related to an
> error in step, but that was stupid since it really is another matter
> altogether. I should have posted it separately, as I have now done.
>
> The code below creates a data.frame comprising three marginally noisy
> surfaces. The code below (including a fix courtesy of David Winsemius that
> avoids a step function error through use of the "by" function) returns
> significant coefficients for regressions based on factor grp. In the case of
> grp A the intercept is not significantly different from zero. However I
> can't get step to consider dropping the intercept term. In the attached code
> the returned function for grp A is y ~ x1 + x2. However I think it should
> return y ~ -1 + x1 + x2 (or I guess y ~ 0 + x1 + x2). If I include -1 in the
> model designation then no intercept is returned for grps B and C. Am I not
> using step properly? Or perhaps there is an alternative methodology I could
> use. Any help appreciated.
>
> # y =      x1 +  x2          for grp A
> # y = 2 + 2x1 + 4x2          for grp B
> # y = 3 + 2x1 + 4x2 + 6x1x2  for grp C
> ind <- matrix(runif(200), ncol=2, dimnames=list(c(), c("x1","x2")))
> d1 <- data.frame(ind,
>                  y=ind[,"x1"]+ind[,"x2"]+rnorm(100,0,0.05),
>                  grp=rep("A",100))
> d2 <- data.frame(ind,
>                  y=2+2*ind[,"x1"]+4*ind[,"x2"]+rnorm(100,0,0.05),
>                  grp=rep("B",100))
> d3 <- data.frame(ind,
>                  y=3+2*ind[,"x1"]+4*ind[,"x2"]+6*ind[,"x1"]*ind[,"x2"]+rnorm(100,0,0.05),
>                  grp=rep("C",100))
> data2 <- rbind(d1,d2,d3)
> # Fit each surface by grp
> model <- y ~ x1*x2
> by(data2, data2$grp, function(x) {step(lm(model, data=x))})
>
> -- 


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



