[BioC] about limma linear models

Thu May 19 05:13:58 CEST 2011

Dear Gang Jiang,

You say that each biological sample has two "biological" replicates. 
This isn't how I would put it, and I assume we have a conflict of 
terminology here.  I am going to assume that your experiment actually has 
two or three technical replicates (repeat hybridizations) of each of your 
five biological samples.  I'm going to assume that BI and BI2 are 
independent biological samples from the same population, as are BM and 
BM2.

Your experimental design doesn't lend itself to a fully satisfactory 
analysis.  This is the best I can come up with:

First, redefine your targets frame to be:

  Cy3 Cy5 Replicate
   BI  BM     1
   BM  BI     1
   BI  BM     2
   BM  BI     2
   BE  BM     3
   BM  BE     3
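
In R, a targets frame like this could be typed in directly as a data frame. This is only a sketch: the object name `targets` is chosen to match the code in this message, and `Replicate` is the blocking variable that pairs up the two arrays of each dye-swap.

```r
# Redefined targets frame: one row per array, dye-swap pairs share a Replicate id
targets <- data.frame(
  Cy3       = c("BI", "BM", "BI", "BM", "BE", "BM"),
  Cy5       = c("BM", "BI", "BM", "BI", "BM", "BE"),
  Replicate = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)
print(targets)
```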

Then define a design matrix:

   design <- modelMatrix(targets,ref="BM")
   design <- cbind(Dye=1,design)
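
To see what these two lines are expected to produce, the same matrix can be built by hand in base R. This is an illustrative sketch, not limma's own code: it assumes the usual convention that each array's log-ratio is Cy5 minus Cy3, so a sample gets +1 when it is on Cy5 and -1 when it is on Cy3, with BM as the reference. The helper name `coef_column` is hypothetical.

```r
targets <- data.frame(
  Cy3       = c("BI", "BM", "BI", "BM", "BE", "BM"),
  Cy5       = c("BM", "BI", "BM", "BI", "BM", "BE"),
  Replicate = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# +1 where the sample is on Cy5, -1 where it is on Cy3, 0 where it is absent
coef_column <- function(sample) {
  (targets$Cy5 == sample) - (targets$Cy3 == sample)
}

# Dye is a constant column; BE and BI are log-ratios relative to BM
design <- cbind(Dye = 1, BE = coef_column("BE"), BI = coef_column("BI"))
print(design)
```

The constant Dye column acts as an intercept, which in a dye-swap design picks up probe-specific dye bias so that it does not contaminate the BI and BE coefficients.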

Then estimate the correlation between technical dye-swaps:

   cor <- duplicateCorrelation(y,design,block=targets$Replicate)

Then fit your linear model:

   fit <- lmFit(y,design,block=targets$Replicate,
                correlation=cor$consensus.correlation)
   fit <- eBayes(fit)

Finally, compare BI to BM:

   topTable(fit,coef="BI")

or compare BE to BM:

   topTable(fit,coef="BE")
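
For anyone who wants to try the whole sequence end to end before running it on real data, here is a rough sketch on simulated log-ratios. The matrix `y` is pure random noise standing in for the normalized M-values (an assumption, so no genes should be truly differential), and the correlation object is named `corfit` here rather than `cor` only to avoid masking the base R function of that name.

```r
library(limma)

targets <- data.frame(
  Cy3       = c("BI", "BM", "BI", "BM", "BE", "BM"),
  Cy5       = c("BM", "BI", "BM", "BI", "BM", "BE"),
  Replicate = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Design matrix with BM as reference, plus a dye-effect column
design <- modelMatrix(targets, ref = "BM")
design <- cbind(Dye = 1, design)

# Simulated stand-in for the log-ratio matrix: 200 probes by 6 arrays
set.seed(1)
y <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)

# Correlation between the two arrays of each dye-swap pair
corfit <- duplicateCorrelation(y, design, block = targets$Replicate)

# Linear model with the blocking correlation, then empirical Bayes moderation
fit <- lmFit(y, design, block = targets$Replicate,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Top-ranked probes for BI versus BM
top_BI <- topTable(fit, coef = "BI", number = 5)
print(top_BI)
```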

This analysis is not perfect, because it treats the last two occurrences of 
BM as an independent biological sample, whereas they are actually repeats 
of BM2.  But it is the best I can come up with.  It should be good enough 
for most purposes, and it's better than other things you might do.

Best wishes
Gordon

On Thu, 19 May 2011, 江刚 wrote:

> Dear Gordon K Smyth:

> I'm sorry for labeling my samples with confusing names. BI and BI2 are 
> biological samples, not biological replicates, and array 1 and array 2 
> are biological reps with dye swap. So I have five biological samples, 
> each has two biological replicates.

> best wishes
> gang jiang

At 2011-05-18 06:29:39, "Gordon K Smyth" <smyth at wehi.EDU.AU> wrote:

>Dear Gang Jiang,
>
>It's hard to give you much help if you don't tell us what the sample names 
>(BI,BM,BI2,BM2) stand for.  In particular you need to tell us what are 
>biological replicates and what are technical.  If the same label appears 
>twice in your targets frame, is it just a technical replicate of the 
>first?
>
>Best wishes
>Gordon
>
>> Date: Tue, 17 May 2011 09:39:58 +0800 (CST)
>> From: 江刚 <sense_0109 at 126.com>
>> To: "Naomi Altman" <naomi at stat.psu.edu>
>> Cc: bioconductor <bioconductor at r-project.org>
>> Subject: Re: [BioC] about limma linear models
>>
>> Thank you for your reply. Actually, I have two biological replicates in 
>> my experiment, and with your information I'm confident in the 
>> statistical power of limma.
>>
>>
>> gang jiang
>>
>>
>>
>>
>> At 2011-05-17 00:15:14, "Naomi Altman" <naomi at stat.psu.edu> wrote:
>>
>>> If you have biological replicates, then using LIMMA is preferred with
>>> small sample sizes.  If you have only technical replicates, then you
>>> really cannot do a proper statistical analysis of the data.  Since
>>> you have a disconnected design, you might use separate channel
>>> analysis to simplify the comparisons you want to make.
>>>
>>> Regards,
>>> Naomi Altman
>>>
>>>
>>> At 07:40 AM 5/16/2011, 江刚 wrote:
>>>> Hello Everyone!
>>>>
>>>>
>>>> I'm now analysing my expression microarray data with limma to
>>>> detect differentially expressed probes. My background is in
>>>> biology, not statistics, so I'm confused by the design matrix and
>>>> contrast matrix in the limma User's Guide.  I have read in the
>>>> targets file as follows:
>>>>   SlideNumber   FileName Cy3 Cy5
>>>> 1           1 15_1_3.txt  BI  BM
>>>> 2           2 15_1_4.txt  BM  BI
>>>> 3           3 18_1_2.txt BI2 BM2
>>>> 4           4 18_1_3.txt BM2 BI2
>>>> 5           5 16_1_1.txt BE2 BM2
>>>> 6           6 16_1_2.txt BM2 BE2
>>>>
>>>>
>>>> Because there is no connection from BI (or BM) to the other samples,
>>>> does that mean I have to test the differences (BI-BM, BI2-BM2,
>>>> BE2-BM2, BE2-BI2) separately?
>>>> Although I read the linear models and empirical Bayes methods theory
>>>> carefully, I understood only a little.  Is it appropriate to detect
>>>> differential expression with limma when there are only two replicates?