[BioC] Double KO and timecourse design / limma

Wed Jan 9 00:16:15 CET 2013

Dear Julien,

Rows U and V of your targets frame seem to be the same CEL file.  Perhaps 
that is a misprint.

The only real tutorial on how to go from experimental design to model 
design/contrast is the limma User's Guide.  My philosophy is that it is 
up to the biologist to decide what the questions of interest are, and 
this includes deciding which samples it makes sense to compare.  The 
makeContrasts() syntax in limma gives you a way to compare any combination 
of conditions you like, and you seem to be well able to use that.

I can't comment on your proposed contrasts, because I don't know what you 
are trying to do.  They don't correspond to anything obvious to me.

I would have thought that if you want to test DKO=KOA+KOB, at time 0h for 
example, then you would simply test the contrast

    DKO_0h - (KOA_0h+KOB_0h)

because setting this contrast equal to zero is mathematically the same as 
DKO_0h = (KOA_0h+KOB_0h).
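
As a minimal sketch of how that contrast could be written with 
makeContrasts(), assuming a design with one coefficient per group named 
after the ShortName labels in the targets frame (the object name cont.add, 
the contrast name DKOvsSum_0h and the reduced list of four 0h groups are 
illustrative only, not taken from the original analysis):

```r
library(limma)

# Assumption: one coefficient per group, named as in the ShortName column.
# Only the four 0h groups are listed to keep the sketch small; a real
# analysis would pass the full design matrix as `levels`.
groups <- c("WT_0h", "KOA_0h", "KOB_0h", "DKO_0h")

# Contrast that is zero exactly when DKO_0h = KOA_0h + KOB_0h
cont.add <- makeContrasts(
  DKOvsSum_0h = DKO_0h - (KOA_0h + KOB_0h),
  levels = groups
)

# Coefficients of the contrast, one row per group
print(cont.add)
```

With a fitted model, the same matrix would be passed to contrasts.fit() 
and then eBayes(), as in the code quoted below.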

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.statsci.org/smyth

On Tue, 8 Jan 2013, Textoris Julien wrote:

> On 08/01/2013 01:43, Gordon K Smyth wrote:
>
>> 
>>> Date: Mon, 07 Jan 2013 08:29:17 +0100
>>> From: Textoris Julien <julien.textoris at gmail.com>
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] Double KO and timecourse design / limma
>>> 
>>> Dear all,
>>> 
>>> First of all, happy new year to everyone, and thanks for all the
>>> knowledge and advice shared through the list!
>>> 
>>> I would like to have your advice/comments on the following design.
>>> 
>>> I have to analyse a microarray dataset generated on MoGene ST 1.0, which
>>> comprises 22 arrays. I have almost no replicates (I could not control
>>> this!), but I thought I could use the time course and the double KO/single
>>> KO status to overcome this?
>>> 
>> 
>> Not sure what you mean.  You need replicates.  If you don't have them, then 
>> show us all 22 rows of the targets frame so we can see what else might be 
>> done.
>> 
>
> Hi,
>
> here is the targets description:
>
> id  FileName          ShortName   Treatment  Time  KO  A-KO  B-KO
>
> A   AKO T 18H.CEL     KOA_18h     1          18    1   1     0
> B   AKO T 2H bis.CEL  KOA_2h      1          2     1   1     0
> C   AKO T 2H.CEL      KOA_2h      1          2     1   1     0
> D   AKO T 6H.CEL      KOA_6h      1          6     1   1     0
> E   AKO UNT bis.CEL   KOA_0h      0          0     1   1     0
> F   AKO UNT.CEL       KOA_0h      0          0     1   1     0
> G   DKO 18H.CEL       DKO_18h     1          18    1   1     1
> H   DKO 2H.CEL        DKO_2h      1          2     1   1     1
> I   DKO 6H.CEL        DKO_6h      1          6     1   1     1
> J   DKO UNT.CEL       DKO_0h      0          0     1   1     1
> K   BKObis 18H.CEL    KOB_18h     1          18    1   0     1
> L   BKObis 2H.CEL     KOB_2h      1          2     1   0     1
> M   BKObis 6H.CEL     KOB_6h      1          6     1   0     1
> N   BKObis UNT.CEL    KOB_0h      0          0     1   0     1
> O   WT 18H.CEL        WT_UNT_18h  1          18    0   0     0
> P   WT 2H.CEL         WT_UNT_2h   1          2     0   0     0
> Q   WT 6H.CEL         WT_UNT_6h   1          6     0   0     0
> R   WT T 18H.CEL      WT_18h      1          18    0   0     0
> S   WT T 2H.CEL       WT_2h       1          2     0   0     0
> T   WT T 6H.CEL       WT_6h       1          6     0   0     0
> U   WT UNT.CEL        WT_0h       0          0     0   0     0
> V   WT UNT.CEL        WT_0h       0          0     0   0     0
>
>
>>> The variables are :
>>>  - Treatment : binary YES or NO
>>>  - Time : samples at 0h (= UNTREATED), 2h, 6h and 18h
>>>  - Strain : Wild type (WT), single KO for gene A or gene B, and double KO.
>>> 
>>> For WT mice, I have 2 UNTREATED (=0h), and a timecourse for Treatment 1
>>> and 0 (WT_UNT_2h, and WT_TTT_2h). For KO mice, I only have treated
>>> samples (except time 0h, which is considered untreated).
>>> 
>>> I have duplicates only for WT.UNTREATED_0h and KOA_0h.
>>> 
>>> I wrote the following design :
>>> 
>>> levels(f) =
>>> c("WT_0h","WT_2h","WT_6h","WT_18h","WT_UNT_2h","WT_UNT_6h","WT_UNT_18h","KOA_0h","KOA_2h", 
>>> ...,"DKO_0h","DKO_2h","DKO_6h","DKO_18h")
>>> 
>>> and built the contrast matrix over time like this :
>>> 
>>> cont.wt = makeContrasts(
>>>       "WT_18h-WT_6h",
>>>       "WT_6h-WT_2h",
>>>       "WT_2h-WT_0h",
>>>       levels=design)
>>> 
>>> idem for KOA, KOB and DKO, and then the comparisons :
>>> 
>>> cont.koa.wt = makeContrasts(
>>>       "(KOA_18h-KOA_6h)-(WT_18h-WT_6h)",
>>>       etc...
>>> 
>>> 
>>> How would you handle the strain variable ? As one variable with four
>>> levels : WT, KOA, KOB, DKO ? Or is it possible to take into account that
>>> DKO is somehow like KOA+KOB ? Do I have to transform 'Strain' into three
>>> binary variables : WT (1/0), KOA(1/0) and KOB (1/0) and code DKO as KOA
>>> = 1 and KOB = 1 ?
>>> 
>> 
>> You have already incorporated the strain variable. There is no need to do 
>> anything else or to make any transformations.
>> 
>> You can test whether DKO is like KOA+KOB by testing the appropriate 
>> contrast.
>> 
>>> The second question is I don't know how to integrate the WT mice that
>>> are untreated ?
>>> 
>> 
>> By forming contrasts of interest.
>> 
>> Best wishes
>> Gordon
>> 
> I understand that I am lacking the basics. I am sorry to ask this, but what 
> is the "appropriate" contrast ? I'm sure the answer is "the one that answers 
> your (biological) question", but although I understand the simple examples 
> (e.g., a single KO vs WT design), I have not been able to find a simple 
> tutorial explaining how to write the complex ones.
>
> For example, to compare DKO and KOA+KOB, I tested (without understanding what 
> I'm doing) the following solutions:
>
> cont.dif.dko2.ovt = makeContrasts(
>  "(DKO_2h-DKO_0h)-((KOB_2h+KOA_2h)-(KOB_0h+KOA_0h))",
>  "(DKO_6h-DKO_2h)-((KOB_6h+KOA_6h)-(KOB_2h+KOA_2h))",
>  "(DKO_18h-DKO_6h)-((KOB_18h+KOA_18h)-(KOB_6h+KOA_6h))",
>  levels=design)
> fit2 <- contrasts.fit(fit, cont.dif.dko2.ovt)
> fit2 <- eBayes(fit2)
> #topTableF(fit2, adjust="BH")
> res = decideTests(fit2,p.value=0.05,lfc=log2(1.5))
> ind = which( apply(res,1,function(x) {length(which(x != 0))>0}) == T)
> length(ind)
> #10
>
> cont.dif.dko3.ovt = makeContrasts(
>  "(DKO_2h-DKO_0h)-((KOB_2h*KOA_2h)-(KOB_0h*KOA_0h))",
>  "(DKO_6h-DKO_2h)-((KOB_6h*KOA_6h)-(KOB_2h*KOA_2h))",
>  "(DKO_18h-DKO_6h)-((KOB_18h*KOA_18h)-(KOB_6h*KOA_6h))",
>  levels=design)
> fit2 <- contrasts.fit(fit, cont.dif.dko3.ovt)
> fit2 <- eBayes(fit2)
> #topTableF(fit2, adjust="BH")
> res = decideTests(fit2,p.value=0.05,lfc=log2(1.5))
> ind = which( apply(res,1,function(x) {length(which(x != 0))>0}) == T)
> length(ind)
> #276
>
> cont.dif.dko4.ovt = makeContrasts(
>  "(DKO_2h-DKO_0h)-(KOB_2h-KOB_0h)",
>  "(DKO_2h-DKO_0h)-(KOA_2h-KOA_0h)",
>
>  "(DKO_6h-DKO_2h)-(KOB_6h-KOB_2h)",
>  "(DKO_6h-DKO_2h)-(KOA_6h-KOA_2h)",
>
>  "(DKO_18h-DKO_6h)-(KOB_18h-KOB_6h)",
>  "(DKO_18h-DKO_6h)-(KOA_18h-KOA_6h)",
>  levels=design)
> fit2 <- contrasts.fit(fit, cont.dif.dko4.ovt)
> fit2 <- eBayes(fit2)
> #topTableF(fit2, adjust="BH")
> res = decideTests(fit2,p.value=0.05,lfc=log2(1.5))
> ind = which( apply(res,1,function(x) {length(which(x != 0))>0}) == T)
> length(ind)
> #0
>
> -> I don't know how to choose between the three ! I would take the first, but 
> don't really know why ...
>
> Is there a good tutorial/blog to learn how to go from experimental design to 
> model design/contrast ?
>
> Thanks again,
>
> Julien
>
>
>>> This experimental design is a bit too complex for me, so any advice
>>> would be greatly appreciated !
>>> 
>>> Thanks in advance,
>>> 
>>> Julien
>>> 
>> 
>> 
>
> -- 
> Sent from my ENIAC
> Julien Textoris, MD, PhD
> Laboratoire d'immunologie, UMR CNRS 7278, INSERM U1095
> Faculté de Médecine Timone, Marseille
> +33 (0)4 91 32 49 71
> Service d'anesthésie et de réanimation
> Hôpital Nord, Marseille
> +33 (0)4 91 96 55 31
>
>
