[BioC] Limma-design matrix for technical replication

Wed Aug 13 04:52:48 CEST 2008

Dear Katerina,

Well, you're starting your microarray experience with a microarray design 
which is quite subtle.

The design and contrast that you give is sensible (it's recommended in the 
limma User's Guide for this sort of design), but you need to understand 
what it's testing.  You're testing for genes which are differentially 
expressed between these three D cells vs these three N cells, relative to 
technical variation.  This approach also allows you to compare the three 
biological replicates if you want.

However, you may want to find genes which are statistically 
significant relative to biological variation (that is, variation between 
the three cell sources), and this is harder.  In principle, the three 
biological replicates can be treated as blocks, but limma isn't smart 
enough to handle the dye-swaps and the blocking at the same time.

With your experiment, you could do it like this. First create a vector 
indicating your dye-swap pattern:

   dyeswap <- c(1,1,-1,-1,1,1,-1,-1,1,1,-1,-1)
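
If you would rather not type the pattern by hand, it can be derived from the targets. This is only a sketch: the `targets` data frame below is retyped from the targets file in your message, and it assumes M is log2(Cy5/Cy3), so an array counts as "unswapped" when D is in the Cy5 channel.

```r
# Cy3/Cy5 labels per array, retyped from the targets file in the question.
targets <- data.frame(
  Cy3 = c("N1","N1","D1","D1","N2","N2","D2","D2","N3","N3","D3","D3"),
  Cy5 = c("D1","D1","N1","N1","D2","D2","N2","N2","D3","D3","N3","N3"),
  stringsAsFactors = FALSE
)

# +1 when D is labelled with Cy5 (M already reads as D vs N),
# -1 when the dyes are swapped.
dyeswap <- ifelse(startsWith(targets$Cy5, "D"), 1, -1)
```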

Then unswap the M-values, i.e. flip the sign on the dye-swapped arrays so 
that every array reads as D vs N (I don't usually recommend this):

   MA2 <- MA
   MA2$M <- t(t(MA$M) * dyeswap)
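
To see what the double transpose does, here is a toy illustration on a small made-up matrix (the numbers and the names `M`, `dyeswap4`, `M2_alt` are purely illustrative, not from your data). Each column (array) is multiplied by its own sign:

```r
# Toy M-value matrix: 2 probes (rows) x 4 arrays (columns), made-up numbers.
M <- matrix(c(1.0,  1.2, -0.9, -1.1,
              0.1, -0.1,  0.0,  0.2),
            nrow = 2, byrow = TRUE)
dyeswap4 <- c(1, 1, -1, -1)

# t(M) has arrays as rows, so the vector is recycled as one sign per array;
# transposing back restores the probes x arrays layout.
M2 <- t(t(M) * dyeswap4)

# Equivalent form that makes the column-wise scaling explicit.
M2_alt <- sweep(M, 2, dyeswap4, `*`)
```

After this, the first probe has positive values on all four arrays, whereas before the sign followed the dye orientation.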

Your design matrix is now very simple with all M-values lined up:

   design <- cbind(Dye=dyeswap,DvsN=1)

Note that I've included a probe-specific dye effect here (the Dye column), 
which you may as well do.
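
As a sanity check on what the two columns mean, here is a noise-free toy example for a single probe. The effect sizes (2 and 0.5) are made up, and plain least squares via `qr.solve` stands in for what `lmFit` does per probe in the simplest case; it is a sketch, not the limma fit itself.

```r
dyeswap <- c(1,1,-1,-1,1,1,-1,-1,1,1,-1,-1)
design  <- cbind(Dye = dyeswap, DvsN = 1)

# Made-up truth for one probe: D vs N log-ratio of 2, dye bias of 0.5.
true_DvsN <- 2
true_dye  <- 0.5

# Raw M-values (Cy5 vs Cy3): the D-vs-N effect flips sign on swapped arrays,
# the dye bias does not.
M_raw <- true_DvsN * dyeswap + true_dye

# Unswapped M-values: the D-vs-N effect is now constant and the dye bias flips,
# which is exactly what the two design columns encode.
M_unswapped <- M_raw * dyeswap

# Least squares on the noise-free toy data recovers both effects.
coefs <- qr.solve(design, M_unswapped)
```

The first coefficient is the dye effect and the second is the D vs N log-ratio, which is why coefficient 2 is the one of interest.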

Then estimate the correlation within biological replicates:

   biolrep <- c(1,1,1,1,2,2,2,2,3,3,3,3)
   dupfit <- duplicateCorrelation(MA2,design,block=biolrep)
   dupfit$consensus.correlation

Check that the correlation is positive. Then fit the model:

   fit <- lmFit(MA2,design,block=biolrep,correlation=dupfit$consensus.correlation)
   fit <- eBayes(fit)
   topTable(fit,coef=2)
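
If you want to try the whole sequence before running it on your own arrays, the following is a rough end-to-end sketch on simulated data. Everything about the simulation (200 probes, the noise levels, the 10 "true" probes) is made up for illustration, and it passes a plain matrix of already-unswapped M-values rather than an MAList. It assumes the limma and statmod packages are installed; the limma part is skipped otherwise.

```r
set.seed(1)
n_probes <- 200
dyeswap  <- rep(c(1, 1, -1, -1), times = 3)
biolrep  <- rep(1:3, each = 4)
design   <- cbind(Dye = dyeswap, DvsN = 1)

# Simulated unswapped M-values: array-level noise plus a per-probe shift shared
# by the four arrays of each biological replicate (this shared shift is what
# produces the within-block correlation).
bio_shift <- matrix(rnorm(n_probes * 3, sd = 0.5), n_probes, 3)[, biolrep]
M2 <- matrix(rnorm(n_probes * 12, sd = 0.3), n_probes, 12) + bio_shift
M2[1:10, ] <- M2[1:10, ] + 2   # first 10 probes truly differ between D and N

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  dupfit <- limma::duplicateCorrelation(M2, design, block = biolrep)
  if (dupfit$consensus.correlation <= 0)
    warning("consensus correlation is not positive; blocking may not be appropriate")
  fit <- limma::lmFit(M2, design, block = biolrep,
                      correlation = dupfit$consensus.correlation)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "DvsN"))
}
```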

All the best
Gordon

> Date: Mon, 11 Aug 2008 17:06:02 +0200
> From: Kateřina Kepková <kepkova at iapg.cas.cz>
> Subject: [BioC] Limma-design matrix for technical replication
> To: <bioconductor at stat.math.ethz.ch>
>
> Dear all,
> As a complete newbie to microarrays, I am trying to analyze an experiment with
> the following design: two samples (differentiated versus undifferentiated cells)
> were compared directly on a two-color oligo array, with 3 biological
> replicates (different cell sources) and 4 technical replicates (arrays) per
> biological replicate (12 arrays altogether). In every set of technical
> replicates, two arrays are dye-swapped. I am not sure how to handle the
> technical and biological replication when trying to fit a linear model. We are
> interested just in the overall comparison of differentiated versus
> undifferentiated cells.
> I have arrived at the following setup:
> Targets file is:
> SlideNumber	FileName	Cy3	Cy5
> 1	1.gpr	N1	D1
> 2	2.gpr	N1	D1
> 3	3.gpr	D1	N1
> 4	4.gpr	D1	N1
> 5	5.gpr	N2	D2
> 6	6.gpr	N2	D2
> 7	7.gpr	D2	N2
> 8	8.gpr	D2	N2
> 9	9.gpr	N3	D3
> 10	10.gpr	N3	D3
> 11	11.gpr	D3	N3
> 12	12.gpr	D3	N3
>
> Where N means undifferentiated and D differentiated cells and 1-3 are
> biological replicates.
>
> Is the following design the correct one? Or is there a better way to obtain
> relevant information?
> Is this extensible for more/less biological replicates?
>
> design <- cbind(D1vsN1 = c(1,1,-1,-1,0,0,0,0,0,0,0,0), D2vsN2 =
> c(0,0,0,0,1,1,-1,-1,0,0,0,0), D3vsN3 = c(0,0,0,0,0,0,0,0,1,1,-1,-1))
> fit <- lmFit(MA, design)
> cont.matrix <- makeContrasts(DvsN = (D1vsN1 + D2vsN2 + D3vsN3)/3, levels =
> design)
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
>
>
> Sorry if I am asking something obvious and thank you in advance for your
> help.
>
> Best regards,
> Katerina
>
> ---------------------------------------------------------------------
> Katerina Kepkova
> Laboratory of developmental biology
> Department of Reproductive and Developmental Biology
> Institute of Animal Physiology and Genetics of the AS  CR, v.v.i.
> Rumburska 89, Libechov 277 21
> Czech Republic
> tel:     +420 315 639 534
> fax:     +420 315 639 510
> e-mail: kepkova at iapg.cas.cz