[R] Analyses of covariation with lme() or lm()

Spencer Graves spencer.graves at pdf.com
Wed Oct 19 05:55:26 CEST 2005


	  Trial consumes 10 degrees of freedom as a fixed effect in m5 but 
only 1 degree of freedom as a random effect in m1 and m3.  The 11 
Trials are NOT 11 completely unrelated entities with no relation to one 
another, as m5 assumes.  They are 11 different instances of more or 
less the same thing, with variations, as m1 and m3 assume.
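
	  You can see the difference directly from the fitted objects (a 
minimal sketch, assuming the same data frame and variable names as in 
your message; Cov1 is dropped here because it is aliased with Trial):

library(nlme)
## Trial as a fixed effect:  the anova table charges it 10 df
anova(lm(Yield ~ Trial + Treat, data = data))
## Trial as a random effect:  a single variance component for its
## intercepts, on top of the Block-within-Trial and residual variances
VarCorr(m1)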

	  Models m2 and m4 are not plausible, because they assume you have 
3*4*11 = 132 independent observations.  That is almost certainly wrong, 
as substantiated by your comparisons of m1 with m2 and of m3 with m4. 
In fact, the concept of an independent observation breaks down in this 
context:  you have, in essence, at least 11 but definitely fewer than 
132;  the exact number of "independent observations" to use for an F 
test, etc., is not easy to pin down.  However, it seems clear that m5 
makes no more sense than m2 and m4.
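
	  One back-of-the-envelope way to see where the effective sample 
size lands is the Kish design effect (a sketch with a hypothetical 
value for the Trial-level intraclass correlation; for your data, rho 
would come from the VarCorr(m1) components):

n   <- 3 * 4 * 11               # 132 raw observations
m   <- 3 * 4                    # observations per Trial
rho <- 0.5                      # hypothetical Trial-level ICC
n / (1 + (m - 1) * rho)         # effective n, about 20 here

	  As rho moves from 0 to 1, this moves from 132 down to 11, which 
is exactly the range above.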

	  Does this help?

	  spencer graves

CG Pettersson wrote:
> Hello all!
> 
> I have a problem that calls for a better understanding, than mine, of 
> how lme() uses the random part of the call.
> 
> The dataset consists of eleven field trials (Trial) with three 
> replicates (Block) and four fertiliser treatments (Treat). Analysing, 
> for example, yield with lme() is easy:
> 
> m1 <- lme(Yield ~ Treat, data=data,
>           random = ~1 | Trial/Block)
> 
> giving estimates of Treat effects with good significances. If I compare 
> m1 with the model without any random structure:
> 
> m2 <- lm(Yield ~ Treat, data=data)
> 
> m1 is, naturally, much better than m2. So far so good.
> 
> Now I have one (1) measure from each Trial, of soil factors, weather, 
> and such, that I want to evaluate. Remember: only one value of the 
> covariate for each Trial. The suggestion I have got from my local guru 
> is to build on m1, like this:
> 
> m3 <- lme(Yield ~ Treat + Cov1 + Treat:Cov1, data=data,
>           random = ~1 | Trial/Block)
> 
> thus giving a model where the major random factor (Trial) is 
> represented both by a single measure of Cov1 in the fixed part and by 
> itself in the random part. Trying the simpler call:
> 
> m4 <- lm(Yield ~ Treat + Cov1 + Treat:Cov1, data=data)
> 
> gives back basically the same fixed effects as m3, but with better 
> significances for Cov1. Testing with anova(m3, m4) naturally gives the 
> answer that m3 is better than m4. OK, so what about handling Trial in 
> the fixed part?
> 
> m5 <- lm(Yield ~ Trial + Treat + Cov1 + Treat:Cov1, data=data)
> 
> lm() swallows this, but silently drops Cov1 from the analysis (Cov1 is 
> constant within Trial, so it is aliased with the Trial effects), an 
> action that feels very logical to me.
> 
> My guru says that using the random call keeps you from overestimating 
> the significance of the covariate. I fear that the risk is just as big 
> that you underestimate it with the same approach. Working on a paper, I 
> naturally want to be able to have some sort of discussion of the impact 
> of the covariates... ;-)
> 
> What is the wise solution? Or, if this is trying to make other people 
> do my homework, could anyone tell me where the homework is? (I've got 
> Pinheiro & Bates and MASS, as well as some others, on the bookshelf.)
> 
> Cheers
> /CG
> 
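
P.S.  A concrete check of your guru's point (a sketch, not run on your 
data):  in m3, lme() should test the main effect of Cov1 in the 
between-Trial stratum, so its denominator degrees of freedom come from 
the 11 Trials, not from the 132 plots that m4 uses:

anova(m3)           # Cov1 row:  denDF on the order of 9, not > 100
summary(m4)$df[2]   # residual df for lm, computed from all 132 rows

That difference in the denominator df is precisely why m4 reports 
"better" significances for Cov1.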

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

spencer.graves at pdf.com
www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915



