[BioC] limma design and contrast matrix for paired experiment

David Westergaard david at harsk.dk
Fri Jan 11 18:31:23 CET 2013


Hello,

I am analysing microarray data performed on two cell cultures, in
which the gene expression were measured before (C) and after treatment
(T), so that the targets look like this:

Cell-line	Treatment	Sample
Cell 1	C	1
Cell 1	T	1
Cell 2	C	2
Cell 2	T	2
Cell 1	C	3
Cell 1	T	3
Cell 2	C	4
Cell 2	T	4
Cell 1	C	5
Cell 1	T	5
Cell 2	C	6
Cell 2	T	6

All experiments were performed on single-channel Agilent arrays, with
4 samples pr. slide. I am interested in determining the differentially
expressed genes between Cell1 before and after treatment, as well as
Cell2 before and after treatment. This is the preliminary code:

# Load and normalize data
RG <- read.maimages(targets$FileName,source="agilent.median",green.only=TRUE)
# Assume there is a col called FileName in the targets section
RG <- backgroundCorrect(RG, method="normexp", offset=16)
RGNorm <- normalizeBetweenArrays(RG, method="quantile")
RGNorm.ave <- avereps(RGNorm, ID=RGNorm$genes$ProbeName)

# Create design
Pairing <- paste(rep(c('C1-','C1-','C2-','C2-'),3),c(1,1,2,2,1,1,2,2,1,1,2,2),rep(c('C','T'),6),sep='')
pair <- factor(Pairing,levels=unique(Pairing))
design <- model.matrix( ~ 0 + pair )
colnames(design) <- c('C1.C','C1.T','C2.C','C2.T')

# Fit data
fit <- lmFit(RGNorm.ave, design=design)

cont.matrix <- makeContrasts(C1 = C1.T-C1.C, C2=C2.T-C2.C, levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)


For the experiment, is the design/contrast matrix a proper choice to
answer the questions of 'Which genes are differentially expressed in
Cell 1' and  'Which genes are differentially expressed in Cell 2'?
Further, should I do any technical correction, such as
duplicateCorrelation or similar? The reason I am asking is that even
at p<=0.01 I am getting a very high number of differentially expressed
probes (4500ish for Cell 1, and 7500ish for Cell 2, respectively), and
I want to make sure this is biological significance, and not some
technical aspect I have missed.

Thanks in advance.

Best,
David


> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] statmod_1.4.16 limma_3.10.3

loaded via a namespace (and not attached):
[1] tcltk_2.14.1 tools_2.14.1



More information about the Bioconductor mailing list