[BioC] Another limma factorial that needs controlling and pairing

Karl Brand k.brand at erasmusmc.nl
Mon Aug 16 17:58:14 CEST 2010


Hi Sabrina,

On 8/16/2010 3:33 PM, Sabrina AT Case wrote:
> Hi, Karl:
> Thank you for the detailed explanation. I need to understand the
> Treatment-contrast  parametrization since I always used the design
> matrix as what you did the first time.  I need to digest more about
> it.
>
Lacking statistical training, i devoted considerable time to both the 
theory and practice of this method of performing multifactorial ANOVA. 
Getting even just a little face to face time with a statistican can also 
be invaluable. Failing this i suggest searching/posting the general 
R-help forum for gaining theoretical and practical advice on the subject.

>   Just a quick comment. I noticed that you used
> decideTests(data, methods="global",adj="BH"...)
>   In Limma's user's guide $10.3, it mentioned that there is no theory
> prove about the combination of BH and global will correctly control
> the FDR for negatively correlated contrasts (though simulation is ok),
> I wonder if you have tried other ways, such as separate topTable etc.
> Does it make difference? Thanks!

Yes, i also tried method="nestedF" and "hierarchical" and "separate" 
(analagous to usin topTable() for each of the contrasts) out of interest 
and can confirm the results btwn methods were moderatley different with 
my data, but didn't change the main 'message' emerging from my gene lists.

Irrespective, I was careful in choosing method="global" as everything 
described in the user guide (and discussed in the Bio-C forum) indicated 
that this approach was most appropriate for my questions.

hth,

karl

>
> For some reason, I posted the same question as you suggested to the
> mailinglist, it never got posted, I hope this time it works,,,
>
> Sabrina
>
> On Wed, Aug 11, 2010 at 9:59 AM, Karl Brand<k.brand at erasmusmc.nl>  wrote:
>> Hi Sabrina,
>>
>> Yes, i believe i did figure out a contrast matrix which allowed me to use
>> limma to indentify DE genes between two factors of interest ('Tissue' and
>> 'Pperiod' in my case) whilst controlling for a third factor ('Time' in my
>> case where the samples were collected at different times of the day as part
>> of another question incorporated in the same experiment but not discussed
>> here). So to be clear, my goal was to find DE genes between all pairwise
>> comparisons of 6 data sets, namely, R.S, R.E, R.L, C.S, C.E and C.L and also
>> for any genes significant for interaction between 'Tissue' and 'Pperiod'.
>>
>> In my original post "[BioC] Another limma factorial that needs controlling
>> and  pairing" i'd outlined an approach based closly on the first example
>> discussed in Gorodn Smyth's chapter 23*, pg412. In the end this approach
>> didn't work for me. Instead i succeeded using the "treatment-contrast"
>> parametrization (pg414) example. You'll find a representation of the code i
>> used for my analysis below. I haven't run it, but for the purpose of
>> illustration, together with my targets file should suffice in explaining
>> exactly how i used limma. The most diffciult part for me was understanding
>> the way R sets up this "treatment-contrast" parametrization. To be sure
>> about how i was setting up the contrasts, i manually calculated the Log2FC
>> for each contrast and compared this with the limma output. This confirmation
>> was critical for a dilatante like myself to be sure my costrasts were indeed
>> what i thought they were. I would advise similar such corss checking of your
>> contrasts.
>>
>> As you'll see from my code, i used 'decide tests' since overall significance
>> of genes across all contrasts was of greater help to me in understanding the
>> biology. As to which correction method you use, there is plenty of online
>> info (BioC archives, limma user guide) to clarify this. In short, "global"
>> seemed the best balanced method for my question(s).
>>
>> Final note on controlling for the paring of tissues (being obtained from the
>> same animal): i omitted this for simplicty in the code below. In fact Prof.
>> Smyth clarified out to deal with this using duplicateCorrelation in a later
>> BioC post if this is relevant for you; and i can also repost if needed.
>>
>> hth,
>>
>> Karl
>>
>> *http://www.statsci.org/smyth/pubs/limma-biocbook-reprint.pdf
>>
>> ###not run
>> library(limma)
>> targets<- read.delim("RNA_Targets.txt")
>> Tissue<- factor(targets$Tissue, levels = c("R", "C"))
>> Pperiod<- factor(targets$Pperiod, levels = c("E", "L", "S"))
>> Time<- factor(targets$Time)
>>
>> design<- model.matrix(~Tissue * Pperiod + Time)
>> colnames(design)
>>
>> fit<- lmFit(rma.pp, design)
>> cont.matrix<- cbind(
>>
>> R.E_R.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>> R.E_R.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>> R.S_R.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>>
>> C.E_C.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
>> C.E_C.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),
>> C.S_C.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,1),
>>
>> R.S_C.S = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
>> R.E_C.E = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>> R.L_C.L = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),
>>
>> R.E_R.S__C.E_C.S = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),#int
>> R.E_R.L__C.E_C.L = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0))#int.2
>>
>> fit2<- contrasts.fit(fit, cont.matrix)
>> fit3<- eBayes(fit2)
>>
>> #write resutls of contrasts for 0.05
>> results<- (decideTests(fit3, method="global", adjust.method="BH",
>> p.value=0.05))
>> write.fit(fit3, results, file="w.int.glob.BH.0.05.csv", sep=",")
>>
>> #write resutls of contrasts for 0.01
>> results<- (decideTests(fit3, method="global", adjust.method="BH",
>> p.value=0.01))
>> write.fit(fit3, results, digits=10, file="w.int.glob.BH.0.01.csv", sep=",")
>>
>> ###end
>>
>> "RNA_Targets.txt":
>>
>> FileName    Tissue    Pperiod    Time    Animal
>> 01file.CEL    R    S    1    1
>> 02file.CEL    C    S    1    1
>> 03file.CEL    R    S    2    2
>> 04file.CEL    C    S    2    2
>> 05file.CEL    R    S    3    3
>> 06file.CEL    C    S    3    3
>> 07file.CEL    R    S    4    4
>> 08file.CEL    C    S    4    4
>> 09file.CEL    R    S    5    5
>> 10file.CEL    C    S    5    5
>> 11file.CEL    R    S    6    6
>> 12file.CEL    C    S    6    6
>> 13file.CEL    R    S    7    7
>> 14file.CEL    C    S    7    7
>> 15file.CEL    R    S    8    8
>> 16file.CEL    C    S    8    8
>> 17file.CEL    R    S    9    9
>> 18file.CEL    C    S    9    9
>> 19file.CEL    R    S    10    10
>> 20file.CEL    C    S    10    10
>> 21file.CEL    R    S    11    11
>> 22file.CEL    C    S    11    11
>> 23file.CEL    R    S    12    12
>> 24file.CEL    C    S    12    12
>> 25file.CEL    R    S    13    13
>> 26file.CEL    C    S    13    13
>> 27file.CEL    R    S    14    14
>> 28file.CEL    C    S    14    14
>> 29file.CEL    R    S    15    15
>> 30file.CEL    C    S    15    15
>> 31file.CEL    R    S    16    16
>> 32file.CEL    C    S    16    16
>> 33file.CEL    R    E    1    17
>> 34file.CEL    C    E    1    17
>> 35file.CEL    R    E    2    18
>> 36file.CEL    C    E    2    18
>> 37file.CEL    R    E    3    19
>> 38file.CEL    C    E    3    19
>> 39file.CEL    R    E    4    20
>> 40file.CEL    C    E    4    20
>> 41file.CEL    R    E    5    21
>> 42file.CEL    C    E    5    21
>> 43file.CEL    R    E    6    22
>> 44file.CEL    C    E    6    22
>> 45file.CEL    R    E    7    23
>> 46file.CEL    C    E    7    23
>> 47file.CEL    R    E    8    24
>> 48file.CEL    C    E    8    24
>> 49file.CEL    R    E    9    25
>> 50file.CEL    C    E    9    25
>> 51file.CEL    R    E    10    26
>> 52file.CEL    C    E    10    26
>> 53file.CEL    R    E    11    27
>> 54file.CEL    C    E    11    27
>> 55file.CEL    R    E    12    28
>> 56file.CEL    C    E    12    28
>> 57file.CEL    R    E    13    29
>> 58file.CEL    C    E    13    29
>> 59file.CEL    R    E    14    30
>> 60file.CEL    C    E    14    30
>> 61file.CEL    R    E    15    31
>> 62file.CEL    C    E    15    31
>> 63file.CEL    R    E    16    32
>> 64file.CEL    C    E    16    32
>> 65file.CEL    R    L    1    33
>> 66file.CEL    C    L    1    33
>> 67file.CEL    R    L    2    34
>> 68file.CEL    C    L    2    34
>> 69file.CEL    R    L    3    35
>> 70file.CEL    C    L    3    35
>> 71file.CEL    R    L    4    36
>> 72file.CEL    C    L    4    36
>> 73file.CEL    R    L    5    37
>> 74file.CEL    C    L    5    37
>> 75file.CEL    R    L    6    38
>> 76file.CEL    C    L    6    38
>> 77file.CEL    R    L    7    39
>> 78file.CEL    C    L    7    39
>> 79file.CEL    R    L    8    40
>> 80file.CEL    C    L    8    40
>> 81file.CEL    R    L    9    41
>> 82file.CEL    C    L    9    41
>> 83file.CEL    R    L    10    42
>> 84file.CEL    C    L    10    42
>> 85file.CEL    R    L    11    43
>> 86file.CEL    C    L    11    43
>> 87file.CEL    R    L    12    44
>> 88file.CEL    C    L    12    44
>> 89file.CEL    R    L    13    45
>> 90file.CEL    C    L    13    45
>> 91file.CEL    R    L    14    46
>> 92file.CEL    C    L    14    46
>> 93file.CEL    R    L    15    47
>> 94file.CEL    C    L    15    47
>> 95file.CEL    R    L    16    48
>> 96file.CEL    C    L    16    48
>>
>>
>>
>> On 8/10/2010 4:21 PM, sabrina wrote:
>>>
>>>
>>>
>>> Hi, Karl:
>>> I have a similar question as yours, did you figure it out since your last
>>> post?
>>>
>>> Other question I have is about using topTable. In your previous
>>> design, did you use topTable for each contrast, meaning for each
>>> particular
>>> contrast, how you find the top list of genes? If that is the case, what is
>>> the
>>> best way to sort the genes? Did you use B value or adj.p? Or you just used
>>> decideTests?
>>>
>>> Thanks
>>>
>>> Sabrina
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> Karl Brand<k.brand at erasmusmc.nl>
>> Department of Genetics
>> Erasmus MC
>> Dr Molewaterplein 50
>> 3015 GE Rotterdam
>> P +31 (0)10 704 3409 | F +31 (0)10 704 4743 | M +31 (0)642 777 268
>>

-- 
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
T +31 (0)10 704 3457 | F +31 (0)10 704 4743 | M +31 (0)642 777 268



More information about the Bioconductor mailing list