[R] glm syntax question

Bill.Venables at csiro.au
Wed Apr 16 02:37:20 CEST 2008


Henrique Dallazuanna from Curitiba writes:

> 
> Try
> 
> On Tue, Apr 15, 2008 at 6:07 AM, maud <maud.july at gmail.com> wrote:
> 
[snip]
> >
> > (1) I need to be able to have my list of variables in the regression
> > be based from a variable vector instead of hard code.
> > (2) I need to be able to collect the significant variables from the
> > output of the regression in a vector.
> >
> > As example code consider:
> >
> > out <- glm(Var_0~Var_1+Var_2, family=binomial(link=logit),
> > data=MyData)
> >
[snip]
> >
> > For (1) I would like some code analogous to the following (which
> > doesn't work)
> >
> > VarVec <- c("Var_1","Var_2")
> > out <- glm(Var_0~VarVec, family=binomial(link=logit), data=MyData)
> 
> form <- as.formula(paste("Var_0", paste(VarVec, collapse="+"), sep = " ~ "))
> out <- glm(form, family=binomial(link=logit), data = MyData)

OK this works for fitting models, but it has a big drawback in that if
you get rid of the variable 'form' you cannot predict from it.  The
fitted model object comes in two bits and that's not a good idea.

You can repair things in a fairly ugly way, namely 

out$call$formula <- form

and the trap is unset once more.  This really points to the need for a
generic function, "formula<-", to make this a more natural and safer
operation itself.  It would not be needed all that often, though...
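
As a minimal sketch of the point above (not part of the original exchange): the data frame below is simulated purely for illustration, and reformulate() is swapped in as an alternative to the paste()/as.formula() construction. It shows that the stored call holds only the symbol 'form', and that patching out$call$formula lets the call be re-evaluated after 'form' has been removed.

```r
# Simulated stand-in for MyData (hypothetical; the original data are not shown)
set.seed(1)
MyData <- data.frame(Var_1 = rnorm(200), Var_2 = rnorm(200))
MyData$Var_0 <- rbinom(200, 1, plogis(1.5 * MyData$Var_1))

VarVec <- c("Var_1", "Var_2")

# reformulate() builds Var_0 ~ Var_1 + Var_2 from the character vector
form <- reformulate(VarVec, response = "Var_0")
out <- glm(form, family = binomial(link = logit), data = MyData)

# The stored call refers to the symbol 'form', not to the formula itself
print(out$call$formula)

# Store the formula in the call so the object no longer depends on 'form'
out$call$formula <- form
rm(form)
print(out$call$formula)

# Re-evaluating the stored call (which is what update() does) now works
out2 <- update(out)
```

Without the assignment to out$call$formula, the final update(out) would fail once 'form' is gone, because the call would still try to look up that variable.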

> >
> > For (2) I would like to be able to access the table shown above (which
> > is only part of what summary(out) displays). I'd like something like
> >
> 
> SigVars <- summary(out)$coefficients[,4]
> SigVars[SigVars < .001]
> 

This is a bit picky, but I would use the column label to make clear
which one I wanted:

SigVars <- summary(out)$coefficients[, "Pr(>|z|)", drop = FALSE]
rownames(SigVars)[SigVars < 0.001]

The drop = FALSE ensures it stays as a matrix, keeping its row names,
as presumably you want to know which ones are significant by name.
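
A small self-contained sketch of this selection, assuming simulated data in place of MyData (the data and the names coefs, pvals and SigNames are illustrative only). It uses a plain column extraction, which returns a named vector, as an alternative to the drop = FALSE form, and drops the intercept on the assumption that only predictor names are wanted.

```r
# Simulated stand-in for MyData (hypothetical; the original data are not shown)
set.seed(1)
MyData <- data.frame(Var_1 = rnorm(200), Var_2 = rnorm(200))
MyData$Var_0 <- rbinom(200, 1, plogis(1.5 * MyData$Var_1))
out <- glm(Var_0 ~ Var_1 + Var_2, family = binomial(link = logit),
           data = MyData)

coefs <- summary(out)$coefficients
colnames(coefs)  # "Estimate" "Std. Error" "z value" "Pr(>|z|)"

# Selecting the column by label gives a numeric vector named by coefficient
pvals <- coefs[, "Pr(>|z|)"]

# Names of the terms below the threshold, without the intercept
SigNames <- setdiff(names(pvals)[pvals < 0.001], "(Intercept)")
SigNames
```

The resulting character vector can be passed straight back into reformulate() or paste() to build a reduced model formula.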

> 
> >
> > table <- summary(out)
> > SigVars <- table[Pr(>|z|) < .001]
> >
> > that is collect all of the variables with a Pr(>|z|) value less than
> > .001.

Bill Venables.


