[BioC] edgeR: design matrix for different condition

Gordon K Smyth smyth at wehi.EDU.AU
Fri Jul 27 01:31:56 CEST 2012


Dear KJ Lim,

Thanks for the rephrasing, which is clearer.  I would have liked you to 
mention however whether you read the documentation that I refered you do.

On Thu, 26 Jul 2012, KJ Lim wrote:

> Dear Prof Gordon,
>
> Good day. Thanks for your prompt replied.
>
> Please allow me to rephrase my previous question.
>
> This model,  ~tree+treat,  is assumed effect of the time of treatment
> will be the same irrespective of genotype.

Yes, this model makes that assumption.

> Thus, test for the coef=3 will gives me the differential expression at 
> 96H irrespective of genotype.

Actually coef=3 refers to 24H, according to the design matrix in your 
original email.  A test for coef=3 would test for DE at 24H vs 0H, 
irrespective of genotype.  But only if the assumption mentioned above is 
true, which is unlikely given the rest of your email.

> I would like to learn the differential expression of treeHS vs treeLS
> at specific time points i.e. 24H or if possible across the whole time
> points. Should I fit my model as
>
>  model A: ~tree*treat    OR   model B: ~tree+tree:treat ?

These models are equivalent, so it is just a matter of convenience which 
one you use, as I tried to explain in the limma documentation I refered 
you to.

> The design matrix columns of model B give:
>  "(Intercept)"  "treeLS"  "treeHS:treat03H"  "treeLS:treat03H"
> "treeHS:treat24H"  "treeLS:treat24H"  "treeHS:treat96H"
> "treeLS:treat96H"
>
> Am I doing right if I fit the model B and test for the coef=3:4 to
> learn the differential expression of treeHS vs treeLS at specific time
> points i.e. 03H? Does this test could tells me which set of genes
> up/down-regulated in treeLS or treeHS?

Yes.  Coef 3 tests for a 3H effect in HS.  Coef 4 tests for a 3H in LS. 
So testing for both coefficients coef=3:4 tests for a 3H effect in either 
treeLS or treeHS.

Best wishes
Gordon

> I'm not good in statistic and I'm learning,  kindly please correct me
> if I'm wrong.
>
> Thank you very much for your time and suggestion.
>
> Best regards,
> KJ Lim
>
>
>
>
> On 26 July 2012 03:39, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>
>> Dear KJ Lim,
>>
>> I don't quite understand your question, because you seem to be asking 
>> for something that isn't a test for differential expression, which is 
>> what edgeR does.  You question "I would like to learn the genes are 
>> express in treeHS but not treeLS and vice versa" seems to be about 
>> absolute expression levels.  edgeR doesn't test for genes that are not 
>> expressed in particular conditions.
>>
>> I'll refer you to the limma section on interaction models in case that 
>> helps, see Section 8.5 of:
>>
>> http://bioconductor.org/packages/2.11/bioc/vignettes/limma/inst/doc/usersguide.pdf
>>
>> Setting up a design matrix is the same for edgeR as it is for limma.
>>
>> Best wishes
>> Gordon
>>
>>> Date: Tue, 24 Jul 2012 17:12:00 +0300
>>> From: KJ Lim <jinkeanlim at gmail.com>
>>> To: Bioconductor mailing list <bioconductor at r-project.org>
>>> Subject: [BioC] edgeR: design matrix for different condition
>>>
>>> Dear the edgeR community,
>>>
>>> Good day.
>>>
>>> I'm analyzing my RNA-Seq experiment with edgeR. My study has 2
>>> different genotypes (treeHS, treeLS) and time of treatment (0H, 3H,
>>> 24H, 96H).
>>>
>>> I first assumed that the treatment effect will be the same for each
>>> genotype and I have the design matrix as:
>>>
>>>  design <- model.matrix(~tree+treat)
>>>
>>>> design
>>>
>>>   (Intercept)  treat03H  treat24H  treat96H  treeLS
>>> 1             1        0        0        0      0
>>> 2             1        0        0        0      0
>>> 3             1        1        0        0      0
>>> 4             1        1        0        0      0
>>> 5             1        0        1        0      0
>>> 6             1        0        1        0      0
>>> 7             1        0        0        1      0
>>> 8             1        0        0        1      0
>>> 9             1        0        0        0      1
>>> 10           1        0        0        0      1
>>> 11           1        1        0        0      1
>>> 12           1        1        0        0      1
>>> 13           1        0        1        0      1
>>> 14           1        0        1        0      1
>>> 15           1        0        0        1      1
>>> 16           1        0        0        1      1
>>>
>>> I used coef=4 to test for the differential expressions between treeHS
>>> and treeLS within the time of treatment, coef=3 to learn the
>>> differential expressions in 2 genotypes at time of treatment 96H.
>>>
>>> ** I would like to learn the genes are express in treeHS but not
>>> treeLS and vice versa at certain time of treatment let's say 24H or
>>> across the whole time of treatment, should I have the design matrix as
>>> below or more steps I need to do?
>>>
>>>  design <- model.matrix(~tree*treat)
>>>
>>> Kindly please light me on this. I appreciate very much for your help and time.
>>>
>>> Have a nice day.
>>>
>>> Best regards,
>>> KJ Lim

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list