[BioC] Design/Contrast Matrix for Two Channel Microarray

Joseph Shaw [guest] guest at bioconductor.org
Wed Feb 5 03:11:50 CET 2014


Hi all,

Could somebody explain how the design matrix is constructed for two-channel microarray experiments in limma? In particular, I am interested in the matrices given for each experiment in Figure 1 of the empirical Bayes paper (http://www.statsci.org/smyth/pubs/ebayes.pdf).

For single-channel arrays, the design matrix seems to take the form of a standard linear model design matrix; that is, 1 where a treatment applies to an array and 0 otherwise. The resulting model parameters can then be compared using an appropriate contrast matrix (where, typically, the coefficients of each contrast sum to zero). This does not appear to be the case for two-channel experiments.
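To make the single-channel case concrete, here is a minimal Python sketch (not limma code). The group labels, the expression values and the variable names are all made up for illustration; it only shows a 0/1 group-means design and a contrast whose coefficients sum to zero.

```python
# Hypothetical single-channel experiment: two control and two treated arrays.
groups = ["ctrl", "ctrl", "trt", "trt"]
levels = ["ctrl", "trt"]

# Group-means design: 1 where the array belongs to the group, 0 otherwise.
design = [[1 if g == lev else 0 for lev in levels] for g in groups]

# Contrast "trt - ctrl"; its coefficients sum to zero.
contrast = [-1, 1]

# Made-up log2 expression values for one gene, one value per array.
y = [7.0, 7.2, 8.1, 8.3]

# For a group-means design the least-squares coefficients are the group means.
coef = [
    sum(v for v, g in zip(y, groups) if g == lev) / groups.count(lev)
    for lev in levels
]

# The contrast estimate is the difference of the two group means.
estimate = sum(c * b for c, b in zip(contrast, coef))

print(design)
print(estimate)
```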

In the above paper, the aforementioned experiments are given in Kerr and Churchill arrow notation (where the arrow head points toward the RNA sample labelled with red dye and the sample at the arrow base is labelled green).

The experiments can be summarised as follows:

(a)
Red     Green
RNA1    RNA2

For this experiment, it seems to me that the only parameter of interest (let's call it mu1) is the log2 fold change between the red and green channels, i.e. log2(RNA1/RNA2), or the mean of these values if there is more than one identical replicate. In this instance, the design "matrix" is simply (1); it becomes a column of 1s if there is more than one identical replicate.

(b)
Red     Green
RNA1    RNA2
RNA2    RNA1

In this experiment, although there are two arrays, there still seems to be only one comparison of interest, as in experiment (a): the difference between RNA1 and RNA2. Because the dyes on the second array are swapped relative to the first, the ratio is inverted too, and inverting the term inside the logarithm negates the response (i.e. log2(RNA2/RNA1) = -log2(RNA1/RNA2)). The second array therefore measures the negative of the parameter measured by the first, so for consistency it gets a coefficient of -1. As a result, we have the design matrix (1, -1), read as a single column with one row per array.
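As a check on the reasoning for (a) and (b), here is a small Python sketch with made-up log-ratios. The helper name fit_single_coefficient is hypothetical; it is just the ordinary least-squares formula for a one-column design, not a limma function.

```python
def fit_single_coefficient(design, log_ratios):
    """Least-squares estimate for a one-column design: sum(x*y) / sum(x*x)."""
    numerator = sum(x * m for x, m in zip(design, log_ratios))
    denominator = sum(x * x for x in design)
    return numerator / denominator

# (a) Two identical replicates, RNA1 red vs RNA2 green on both arrays.
# Made-up log2(red/green) values for one gene.
m_a = [1.1, 0.9]
beta_a = fit_single_coefficient([1, 1], m_a)   # plain mean of the log-ratios

# (b) Dye swap: the second array measures log2(RNA2/RNA1).
m_b = [1.2, -0.8]
beta_b = fit_single_coefficient([1, -1], m_b)  # (1.2 - (-0.8)) / 2

print(beta_a, beta_b)
```

With the (1, -1) design, the dye-swapped log-ratio is sign-flipped before averaging, so both arrays contribute to the same estimate of log2(RNA1/RNA2).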

I'm confused about how the design matrices are formed for experiments in (c) and (d).

In (c), RNA1 and RNA2 are compared through a common reference.

(c)
Red     Green
Ref     RNA1
RNA1    Ref
RNA2    Ref

The design matrix is given by (-1 0; 1 0; 1 1), where ";" denotes the end of a matrix row. The first coefficient estimates the difference between RNA1 and the reference sample (RNA1 - Ref), whilst the second coefficient estimates the difference between RNA2 and RNA1 (RNA2 - RNA1).
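One way to see where these rows could come from is sketched below in Python (an illustration, not limma's implementation). The assumption is that each sample is written in terms of the two coefficients, with Ref as the origin: Ref = (0, 0), RNA1 = (1, 0) and RNA2 = (1, 1), because RNA2 - Ref = (RNA1 - Ref) + (RNA2 - RNA1). Each array's row is then the red sample's vector minus the green sample's vector. The names sample_coords and design_rows are hypothetical.

```python
# Each sample expressed in the coefficients (RNA1 - Ref, RNA2 - RNA1).
sample_coords = {
    "Ref": (0, 0),
    "RNA1": (1, 0),
    "RNA2": (1, 1),
}

# Arrays as (red, green) pairs, in the order of the table for experiment (c).
arrays = [("Ref", "RNA1"), ("RNA1", "Ref"), ("RNA2", "Ref")]

def design_rows(arrays, coords):
    """Row for each array = coordinates of red sample minus those of green."""
    return [
        tuple(r - g for r, g in zip(coords[red], coords[green]))
        for red, green in arrays
    ]

design = design_rows(arrays, sample_coords)
print(design)
```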

Experiment (d) is a saturated direct design comparing three samples.

(d)
Red     Green
B       A
C       B
A       C

The design matrix is given by (1 0; 0 1; -1 -1), where the first coefficient estimates the difference B - A and the second coefficient estimates the difference C - B.
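The quoted matrix can be reproduced with a short Python sketch (again an illustration, not limma code). It assumes the arrays are taken in the order B vs A, C vs B, A vs C (red vs green), and that samples are written in terms of the coefficients (B - A, C - B) with A as the origin, so A = (0, 0), B = (1, 0), C = (1, 1). The expression levels are made up and noise-free, purely to check that the design reproduces the observed log-ratios.

```python
# Each sample expressed in the coefficients (B - A, C - B), with A as origin.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1)}

# Arrays as (red, green) pairs; this order is an assumption (see above).
arrays = [("B", "A"), ("C", "B"), ("A", "C")]

# Row for each array = coordinates of red sample minus those of green.
design = [
    tuple(r - g for r, g in zip(coords[red], coords[green]))
    for red, green in arrays
]

# Made-up, noise-free log2 expression levels for one gene.
level = {"A": 5.0, "B": 6.0, "C": 8.0}
beta = (level["B"] - level["A"], level["C"] - level["B"])

# Observed log-ratios (red minus green) and the values the design predicts.
observed = [level[red] - level[green] for red, green in arrays]
fitted = [sum(x * b for x, b in zip(row, beta)) for row in design]

print(design)
print(observed, fitted)
```

The third array carries no new coefficient: A - C is minus the sum of the other two differences, which is why its row is (-1 -1).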

Also, on page 39 of the limma user guide (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf), there is a design and contrast matrix for a direct two-colour design that compares CD4, CD8 and DN. I'm not really sure how this design/contrast combination works.

An explanation of the above structures would be greatly appreciated.

Joseph

--
Sent via the guest posting facility at bioconductor.org.


