[R] Design matrix for species mixture
Margot Neyret
margotneyret at gmail.com
Fri May 12 10:20:47 CEST 2017
Hi Jim,
Sorry if my question was not clear. I will try to explain again…
I have one response variable Y, let’s say vegetation cover. Then I have my explanatory variable, let’s call it Crop. In my field I can have either Maize (m), Bean (b), Pumpkin (p) or mixtures : m+b, m+p, b+p, m+b+p. I also have a second explanatory variable X (e.g. soil moisture content).
So for now my variable Crop has 7 levels [m, p, b, mp, mb, pb, mpb]. If I want to compare Y between these crops, I write :
summary(lm(Y~Crop))
TukeyHSD(aov(Y~Crop))
And with X :
summary(lm(Y~Crop*X))
… etc.
Then I want to compare the effect of each individual crop. I can do as you suggested Jim : 3 variables M, B, P with values 0 or 1 if the crop is present, 0 otherwise and add interactions.
summary(lm(Y~M+B+P+M*P+…))
But here come my questions. This model seems to have 2 drawbacks.
1. How can I do pairwise comparisons here as I would do with Turkey test ? How can test the hypothesis, for instance, “Bean provides higher cover than Maize whatever the mixture is” ?
2. When it comes to interactions with other variables it gets quite complicated (also note that in real life I have 5 crops, not 3 and other explanatory variables) :
lm(Y~M*X+ B*X + P*X + M*B + M*P + P*B)
So, isn’t there a way to make it more concise ?
I hope it makes sens.
Thanks,
Margot
> On Friday, May 12, 2017 at 2:53 AM, Jim Lemon <drjimlemon at gmail.com (mailto:drjimlemon at gmail.com)> wrote:
> Hi Margot,
> I'm not sure I understand your model, but if I make up some data in
> which the response variable is vegetation cover and the three species
> are:
>
> A - eats one type of plant
> B - eats another type of plant
> C - preys on herbivorous insects
>
> df<-read.table(text="field,propveg,A,B,C
> 1,1,0,0,1
> 2,0.3,1,1,1
> 3,0.6,0,1,1
> 4,0.2,1,1,1
> 5,0.7,1,0,1
> 6,0.8,0,0,0
> 7,0.3,1,0,0
> 8,0.4,0,1,0
> 9,0.1,1,1,0
> 10,0.5,0,1,0
> 11,0.5,1,0,1
> 12,0.1,1,1,0
> 13,0.6,0,1,1
> 14,0,1,1,0",
> sep=",",header=TRUE)
> print(summary(lm(propveg~A+B+C+A:B+A:C+B:C,df)))
>
> Is that something like what you want?
>
> Jim
>
> On Fri, May 12, 2017 at 12:40 AM, Margot Neyret <margotneyret at gmail.com> wrote:
> > Hello,
> >
> > I have fields with species mixtures (for instance, species a, b, c, a+b, a+c, b+c), and I look at the effect of each species on a response Y. More specifically, I would like to compare the effect of individual species, either alone or in mixture.
> >
> > > Y = rnorm(18,0,1)
> > > mixture= rep(c('a','b', 'c', 'a+b', 'a+c', 'b+c'), each = 3)
> >
> > Thus I create variables A, B and C with :
> > - A = 1 when the mixture contains a (ie mixture = a or a+b or a+c); and 0 otherwise.
> > - Idem for variables C and B.
> >
> > > A = ifelse(mixture %in% c('a', 'a+b', 'a+c'), 1, 0)
> > > B = ifelse(mixture %in% c('b', 'a+b', 'b+c'), 1, 0)
> > > C = ifelse(mixture %in% c('c', 'a+c', 'b+c'), 1, 0)
> >
> > My plan was to build a design matrix from these 3 variables, that would then allow me to compare the effects of each species.
> >
> > > mm = model.matrix(~A+B+C+0)
> > > summary(lm(Y~mm))
> > Coefficients:
> > Estimate Std. Error t value Pr(>|t|)
> > (Intercept) -0.8301 0.6221 -1.334 0.203
> > mmA 1.1636 0.4819 2.415 0.030 *
> > mmB 0.8452 0.4819 1.754 0.101
> > mmC -0.1005 0.4819 -0.208 0.838
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 0.8347 on 14 degrees of freedom
> > Multiple R-squared: 0.4181, Adjusted R-squared: 0.2934
> > F-statistic: 3.353 on 3 and 14 DF, p-value: 0.04964
> >
> > My questions :
> > 1. Does this approach make any sense ? I have a feeling I am doing something strange but I cannot put my finger on it.
> > 1. My ddl are wrong, I should not have an intercept here, or at least my intercept should be one of my species. Should I just remove one species form the design matrix ?
> > 2. Is there any way to do post-hoc tests on my species now, as I would have done with Tukey test or lsmeans ?
> >
> > My objective afterwards is to add other explanatory variables and interactions in the model.
> >
> > Thanks in advance !
> >
> > M. N.
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]
More information about the R-help
mailing list