[BioC] edgeR: generating a correct design matrix - multifactorial design

James W. MacDonald jmacdon at uw.edu
Thu Jul 26 17:02:53 CEST 2012


Hi Natasha,

On 7/26/12 8:29 AM, Natasha Sahgal wrote:
> Dear Prof. Gordon and List,
>
> I have an RNA-Seq expt for which I'd like to use edgeR, as it is multifactorial in design.
> Having gone through the user guide, I am a bit confused as to how to generate the model for my expt.
>
> The expt: 2 cell-lines (mut,wt), 2 conditions(stimulated, unstimulated), n=2 in each group.
> My aim: to detect DE genes based on the effect of stimulus on mut cells.
>
> Thus,
> dat
>    Sample Group Stim
> 1      1    WT   No
> 2      2    WT   No
> 3      3   WT+  Yes
> 4      4   WT+  Yes
> 5      5   Mut   No
> 6      6   Mut   No
> 7      7  Mut+  Yes
> 8      8  Mut+  Yes
>
> Now, if this were array data the model would be:
> design = model.matrix(~dat$Group)
> and whilst fitting the model I could make a contrast such as (Mut+ - Mut) - (WT+ - WT)
>
> I am not sure how to do this for the RNA-Seq data (i.e. what should the model be? And what coefficients should I pull out?)

You use the same exact design matrix you would have used for array data. 
Note that glmLRT() has a contrast argument, so if you parameterize the 
way you are doing here, you will need to use that (I am assuming you are 
just doing something like model.matrix(~Group), as including Stim 
doesn't make sense).

Alternatively you could parameterize by separating out the group from 
the stim and fit the interaction as part of the design matrix:

 > Group <- factor(rep(c("WT", "Mut"), each=4))
 > Group <- relevel(Group, "WT")
 > Stim <- factor(rep(c("No","Yes"), each=2, times=2))
 > model.matrix(~Group*Stim)
   (Intercept) GroupMut StimYes GroupMut:StimYes
1           1        0       0                0
2           1        0       0                0
3           1        0       1                0
4           1        0       1                0
5           1        1       0                0
6           1        1       0                0
7           1        1       1                1
8           1        1       1                1

And glmLRT() will test the interaction term by default.

Best,

Jim
> Whether the model should be:
> 1) model.matrix(~dat$Group) and somehow in the glmLRT function specify the above contrast in some manner?
>
> 2) model.matrix(~dat$Group+dat$Group*dat$Stim) (coefficient/contrast?)
>
> 3) model.matrix(~dat$Group*dat$Stim) (coefficient/contrast?)
>
>
> I'd appreciate any help and advice.
>
>
> Many Thanks,
> Natasha
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list