[R] Random and fixed effect model with a covariate

Wed Jan 30 07:34:55 CET 2008

Dear Sophie,

> I wonder if anyone can please offer any advice on a model 
> including 2 fixed effects and 1 random effect, as well as a 
> covariate? 
> 
> The experimental design is as follows:
> I have a two by two factor design, where the two factors, Age 
> (A) and Group size (G), both have 2 levels (old or young, and 
> 1 or 3 respectively), and I am interested in the effect of 
> these factors upon a continuous Response variable (R).
> 
> The data is from social insects, so the above design is 
> repeated for several colonies. I think that Colony (C) should 
> be a random factor, as they were taken from a larger pool of 
> available colonies. Identification number (I) is different 
> for each individual.
> 
> Furthermore, I have a covariate, Mass (M), which could have 
> an effect upon R.

This all sounds quite reasonable.

> Do you know which of the below (if any) would be the most 
> appropriate model please?
> 
> model<-lme(R~A*G*M,random=~1|C/I)
> model<-lme(R~A*G+M,random=~1|C/I)

Well, this depends on your research question and on whether the interactions A:M, G:M and A:G:M (the terms that only the first model contains) can be interpreted meaningfully, I would say.
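
To see exactly which terms the two formulas differ in, a small sketch along these lines may help. It only expands the fixed-effects formulas with terms() and does not fit anything; the object names full, additive and extra are made up for illustration.

```r
# Expand both fixed-effects formulas into their individual terms.
# No data are needed for this; terms() only parses the formula.
full     <- attr(terms(R ~ A * G * M), "term.labels")
additive <- attr(terms(R ~ A * G + M), "term.labels")

print(full)      # main effects plus all two-way and the three-way interaction
print(additive)  # main effects plus the A:G interaction only

# Terms that only the first model estimates
extra <- setdiff(full, additive)
print(extra)
```

If the terms listed in extra have no sensible biological interpretation, the second, simpler formula is probably the more natural starting point.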

> or is it necessary to state that A and G are factors as below:
> 
> model<-lme(R~as.factor(A)*as.factor(G)*M,random= ~1|C/I)
> model<-lme(R~as.factor(A)*as.factor(G)+M,random= ~1|C/I)

Yes, indeed, you need to specify factors as such. If they are factors, it probably makes sense to change their type in the data frame itself, as many other functions (e.g. plotting functions) treat factors differently from continuous variables.

If your factor levels include letters, then R automatically turns these variables into factors when you read the data. Alternatively, a two-level factor can be coded numerically with 0s and 1s instead of being explicitly turned into a factor variable.
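
As a rough sketch of both points, assuming a made-up toy data frame (the values below are invented and not from your experiment), the conversion could be done once in the data frame. The plain lm() fit is used here only to keep the comparison short; G01, fit_factor, fit_numeric and same_fit are hypothetical names.

```r
# Hypothetical toy data: A stored as text, G stored as the numbers 1 and 3.
d <- data.frame(
  A = c("old", "young", "old", "young", "old", "young", "old", "young"),
  G = c(1, 1, 3, 3, 1, 1, 3, 3),
  R = c(5.1, 4.2, 6.3, 5.0, 5.4, 4.0, 6.1, 5.2)
)

# Convert once in the data frame instead of calling as.factor() in every formula.
d$A <- factor(d$A)
d$G <- factor(d$G)
str(d)

# A two-level factor and its 0/1 coding lead to the same coefficients.
d$G01 <- as.numeric(d$G == "3")
fit_factor  <- lm(R ~ G,   data = d)
fit_numeric <- lm(R ~ G01, data = d)
same_fit <- isTRUE(all.equal(unname(coef(fit_factor)),
                             unname(coef(fit_numeric))))
print(same_fit)
```

Note that leaving G as the raw numbers 1 and 3 would still give the same fitted values, but the slope would then be expressed per unit of group size rather than as the difference between the two levels.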

> Do you also please have any advice upon the effect it would 
> have to change the order in which the factors and covariate 
> are placed in the model? When I tried moving them around in 
> the above models it changed the test statistics slightly. 

Well, the standard anova output follows a type I sum-of-squares logic, i.e. each additional variable (lower down in the list) is tested for a significant improvement of the model so far. Thus, the sequence has some importance.

If you use anova(..., type = 'marginal') you can follow a type III sum-of-squares logic, i.e. it is tested whether dropping a specific variable from the full model leads to a significant deterioration of the model or not.
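
The difference between the two kinds of tests can be sketched with simulated data. Everything below is an assumption made for illustration: the data are invented, Mass is deliberately made to depend on Age so that the order matters, and the sketch uses only a colony-level random intercept (random = ~1|C) because the simulated data contain a single measurement per individual.

```r
library(nlme)

set.seed(1)
# Simulated design: 8 colonies, 2 x 2 treatment, 3 individuals per cell.
d <- expand.grid(rep = 1:3, A = c("young", "old"), G = c(1, 3),
                 C = paste0("col", 1:8))
d$A <- factor(d$A)
d$G <- factor(d$G)
d$C <- factor(d$C)

# Mass depends on age, so M and A share information.
d$M <- 10 + 2 * (d$A == "old") + rnorm(nrow(d))
colony_effect <- rnorm(8, sd = 1)
d$R <- 5 + 1 * (d$A == "old") + 0.5 * (d$G == "3") + 0.3 * d$M +
  colony_effect[as.integer(d$C)] + rnorm(nrow(d))

# Same model, covariate written last or first in the formula.
m_last  <- lme(R ~ A * G + M, random = ~ 1 | C, data = d)
m_first <- lme(R ~ M + A * G, random = ~ 1 | C, data = d)

# Sequential (type I) tests: the F-value for M depends on its position.
seq_last  <- anova(m_last)
seq_first <- anova(m_first)
print(seq_last)
print(seq_first)

# Marginal tests: the F-value for M does not depend on the order.
marg_last  <- anova(m_last,  type = "marginal")
marg_first <- anova(m_first, type = "marginal")
print(marg_last)
print(marg_first)
```

In the sequential tables, M is tested after A and G in one fit and before them in the other, which is why its test statistic changes; in the marginal tables it is always tested against the full model.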

In my experience, there are different schools of thought, each preferring one approach or the other. Thus, perhaps you should use what is common in your field.

Regards, Lorenz
- 
Lorenz Gygax, Dr. sc. nat., postdoc
Federal Veterinary Office
Centre for proper housing of ruminants and pigs
Agroscope Reckenholz-Tänikon Research Station ART
Tänikon, CH-8356 Ettenhausen / Switzerland