[R] adding variable

Martin Wegmann mailinglist2_wegmann at web.de
Tue Nov 18 22:51:28 CET 2003


OK, I will try to explain it more clearly.

I am not looking for step(), add1(), drop1() or similar commands. This has 
nothing to do with variable selection.

I have two data frames, one with environmental variables and another one with 
animal data (let's say absence/presence of 10 species).

First I look at which env. variables explain the presence of species 1:

glm(species1~env.var1+env.var2+.....) -> glm.spec1

step(glm.spec1) -> glm.spec1.step

I get certain env. variables which have the biggest explanatory power.
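
As a rough sketch of this first step, the terms kept by step() can be read off 
the fitted object. The data frames env and dat, the three toy variables and 
family = binomial (chosen because the response is absence/presence) are all 
assumptions made up for illustration, not my real data:

```r
# Hypothetical toy data: three env. variables, one presence/absence response.
set.seed(42)
n <- 100
env <- data.frame(env.var1 = rnorm(n),
                  env.var2 = rnorm(n),
                  env.var3 = rnorm(n))
dat <- cbind(env, species1 = rbinom(n, 1, plogis(1.5 * env$env.var1)))

# Full model; family = binomial is an assumption for 0/1 data.
glm.spec1 <- glm(species1 ~ ., family = binomial, data = dat)

# Stepwise selection, run quietly.
glm.spec1.step <- step(glm.spec1, trace = 0)

# Names of the terms that survived the selection.
kept <- attr(terms(glm.spec1.step), "term.labels")
kept
```

The vector kept then holds what I call env.varX below.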

Now I would like to treat the absence/presence data of my other species like 
env. variables which could influence the presence of species1. 
I keep the env. variables selected in glm.spec1.step (I call them env.varX+...):

glm(species1~env.varX+......+species2) -> glm.species1.sp2


glm(species1~env.varX+......+species3) -> glm.species1.sp3


and this procedure should be repeated for all remaining species.

I am looking for a method to automatically add each of species2 up to 
species10 in turn and run glm(). 
The first part with the env. variables should be kept as it is, but the last 
variable (speciesX) should change each time. I am looking for something 
like a placeholder: on each run the command picks a different species from the 
species data frame and inserts it in place of the placeholder.
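
What I have in mind might look roughly like the loop below. It is only a 
sketch: the data frames env, spec and dat, the four toy species and 
family = binomial are hypothetical stand-ins, and update() is used with a 
formula pasted together from the species name:

```r
# Hypothetical toy data standing in for the two data frames.
set.seed(1)
n <- 60
env <- data.frame(env.var1 = rnorm(n), env.var2 = rnorm(n))
spec <- as.data.frame(matrix(rbinom(n * 4, 1, 0.5), ncol = 4,
                             dimnames = list(NULL, paste0("species", 1:4))))
dat <- cbind(env, spec)

# Base model with the env. variables that are kept fixed
# (family = binomial is an assumption for 0/1 data).
base <- glm(species1 ~ env.var1 + env.var2, family = binomial, data = dat)

# All species except the response take the place of the placeholder in turn.
others <- setdiff(names(spec), "species1")
fits <- list()
for (sp in others) {
  # ". ~ . + speciesX" keeps the response and env. terms and adds one species.
  fits[[sp]] <- update(base, as.formula(paste(". ~ . +", sp)))
}

# One fitted model per added species, named after that species.
names(fits)
```

Something like sapply(fits, AIC) could then be compared with AIC(base) to see 
which species adds the most.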

I hope I explained it better. Thanks, Martin

On Tuesday 18 November 2003 22:14, Prof Brian Ripley wrote:
> Are you looking for something like add1 then?
>
> We do need a much clearer explanation of what you are trying to do to be
> able to help you: and not with y used in two separate senses!
>
> On Tue, 18 Nov 2003, Martin Wegmann wrote:
> > On Tuesday 18 November 2003 19:32, Prof Brian Ripley wrote:
> > > On Tue, 18 Nov 2003, Martin Wegmann wrote:
> > > > I have count data of animals (here y, y1, y2...) and env. variables
> > > > (x, x1, x2 ,....).
> > > >
> > > > I used a glm
> > > >
> > > > glm(y~x1+x2+x3....)
> > > >
> > > > glm(y1~x1+x2+x3....)
> > > >
> > > > and now I would like to add the count data of other species to
> > > > investigate if they might have a bigger impact than the env.
> > > > variables:
> > > >
> > > > #x? are the selected var from the first glm run
> > > >
> > > > glm(y~x?+x?+y1)
> > > >
> > > > glm(y~x?+x?+x?+y2)
> > > >
> > > > ....
> > > >
> > > > I wonder if there is a more elegant method to do this than adding
> > > > (and removing) each y by hand.
> > >
> > > Do you mean each x?  In either case, see ?update.
> >
> > update looks good but with update and with adding the y I have to do it
> > manually.
> >
> > I thought something like doing
> >
> > glm(y~x+x1+x2+....+y§)
> >
> > where y§ is: grep y1 out of df.y run glm and name it
> > grep y2 out of df.y run glm .....
> >
> > until all y's of df.y have been included once in the model.
> > every time only one y§ has to be included
> >
> > the included x's have to be kept. I only want to look if one species
> > variables has more explanation power than the env. variables.
> >
> > perhaps this helps to understand what I am looking for:
> > I think bash scripts are not possible in R but it would look like such a
> > bash script for GRASS:
> >
> > for variable in y1 y2 y3  ....; do
> >
> > glm(y~x+x1+x2....+$variable)->glm.$variable
> > ; done
> >
> > #where $variable refers to the name of read in y's.
> >
> >
> > Martin



