[R] linear models and colinear variables...

Jonathan Baron baron at psych.upenn.edu
Fri Jul 2 04:43:36 CEST 2004

On 07/01/04 17:53, Peter Gaffney wrote:
>> When you do this, you are including all the
>> interaction terms.
>> The * indicates an interaction, as opposed to +.
>In this particular case I need to do exactly this;
>this is a study of antibiotic resistance - two of the
>variables respectively are type of bacteria and
>antibacterial agent. The evolutionary/epidemiological

Then you should use at most the interaction of every agent with
every germ.  Including all the interaction terms means that you
look at germ*germ and drug*drug interactions too.
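
A rough sketch of the difference, using made-up indicator names
(germ1, germ2, drug1, drug2 are my stand-ins, not your variables).
terms() only expands the formula, so no data are needed.  The
grouped form (germ1 + germ2) * (drug1 + drug2) is my shorthand for
crossing every germ with every drug.

```r
# Hypothetical names; this expands formulas only and fits nothing.
# Crossing everything with * brings in germ:germ and drug:drug terms too.
all_terms <- attr(terms(y ~ germ1 * germ2 * drug1 * drug2), "term.labels")

# Crossing the germ group with the drug group keeps the main effects
# and only the germ-by-drug two-way interactions.
cross_terms <- attr(terms(y ~ (germ1 + germ2) * (drug1 + drug2)),
                    "term.labels")

print(all_terms)
print(cross_terms)

# Terms present only in the fully crossed formula.
print(setdiff(all_terms, cross_terms))
```

The last line lists what the fully crossed formula adds, which
includes germ1:germ2 and drug1:drug2.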

>behavior of each pairing of these factors is
>different.  Can I remove some lower order terms; for
>example, if I get rid of Bugtype:Usage.level.of.drug
>and Drugtype:Usage.level.of.drug will
>Bugtype:Drugtype:Usage.level.of.drug still be valid?

I don't think this example is "removing lower order terms," or
else I don't understand it.  I think it is what I was just
saying.  You would want something like (to use my terms),
germ1*drug1 + germ1*drug2 + ... + germN*drug(M-1) + germN*drugM.
Each of these would automatically include the relevant
first-order terms.  For example, germ1*drug1 would include germ1
and drug1 effects alone.  And I think you want those, if you are
really interested in the interaction.  Otherwise, what you think
is an interaction could just be a main effect.
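
To illustrate that last point, here is a toy sketch with made-up,
noise-free data (germ1, drug1, and y are invented for the example,
not taken from your study).  The response depends on drug1 alone,
so there is no real interaction.

```r
# Made-up, noise-free data: y depends on drug1 only.
d <- expand.grid(germ1 = c(0, 1), drug1 = c(0, 1))
d$y <- 2 * d$drug1

# germ1*drug1 expands to germ1 + drug1 + germ1:drug1.
with_main <- lm(y ~ germ1 * drug1, data = d)

# Product term alone, with both main effects left out.
no_main <- lm(y ~ germ1:drug1, data = d)

print(coef(with_main))
print(coef(no_main))
```

With the main effects in the model, the germ1:drug1 coefficient is
essentially zero.  With them left out, it comes out around 1.33,
because the product term is soaking up the drug1 main effect.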

But I really don't understand this setup.  It sounds like each
observation consists of a randomly chosen SET of germs and a SET
of drugs, so you can classify each data point in terms of the
presence or absence of each germ and the presence or absence of
each drug.  Is that it?  It isn't crazy, but it is unusual.

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:            http://www.sas.upenn.edu/~baron
