# [R] Design matrix for species mixture

Margot Neyret margotneyret at gmail.com
Thu May 11 16:40:39 CEST 2017

```Hello,

I have fields with species mixtures (for instance, species a, b, c, a+b, a+c, b+c), and I am looking at the effect of each species on a response Y. More specifically, I would like to compare the effects of the individual species, whether they are grown alone or in a mixture.

>Y = rnorm(18,0,1)
>mixture= rep(c('a','b', 'c', 'a+b', 'a+c', 'b+c'), each = 3)

I therefore create indicator variables A, B and C:
- A = 1 when the mixture contains a (i.e. mixture is a, a+b or a+c), and 0 otherwise.
- Likewise for variables B and C.

>A = ifelse(mixture %in% c('a', 'a+b', 'a+c'), 1, 0)
>B = ifelse(mixture %in% c('b', 'a+b', 'b+c'), 1, 0)
>C = ifelse(mixture %in% c('c', 'a+c', 'b+c'), 1, 0)

My plan was to build a design matrix from these 3 variables, that would then allow me to compare the effects of each species.

> mm = model.matrix(~A+B+C+0)
> summary(lm(Y~mm))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.8301     0.6221  -1.334    0.203
mmA           1.1636     0.4819   2.415    0.030 *
mmB           0.8452     0.4819   1.754    0.101
mmC          -0.1005     0.4819  -0.208    0.838
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8347 on 14 degrees of freedom
Multiple R-squared: 0.4181, Adjusted R-squared: 0.2934
F-statistic: 3.353 on 3 and 14 DF, p-value: 0.04964

My questions:
1. Does this approach make any sense? I have a feeling I am doing something strange but I cannot put my finger on it.
2. My degrees of freedom are wrong: I should not have an intercept here, or at least my intercept should correspond to one of my species. Should I just remove one species from the design matrix?
3. Is there any way to do post-hoc tests on my species now, as I would have done with a Tukey test or lsmeans?

My objective afterwards is to add other explanatory variables and interactions in the model.