[BioC] Design/Contrast Matrix for Two Channel Microarray

Joseph Shaw josph.sh at gmail.com
Thu Feb 6 02:14:36 CET 2014


Hi Gordon,

Thanks for your response - I believe it has cleared everything up.

So, for example, for experimental design (d), the simple saturated
direct design, we have the design matrix (1 0; 0 1; -1 -1).

The first coefficient represents B-A, hence the first row (1 0). The
second coefficient represents C-B, hence the second row (0 1). The
third row represents the third array (A-C), so we have:

(-1 -1) = -(B-A)-(C-B) = -B+A-C+B = A-C

which is what we wanted. Is this correct?
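
The arithmetic above can also be checked numerically. The following is a
minimal base-R sketch, not taken from limma itself; the names `coefs`,
`design` and `array_contrasts` are illustrative assumptions. It writes each
coefficient in terms of the samples A, B and C, then multiplies by the
design matrix to recover what each array measures.

```r
# Sample-level definition of the two coefficients in design (d).
# Rows are the samples A, B, C; columns are the coefficients.
coefs <- cbind(
  BvsA = c(A = -1, B =  1, C = 0),   # coef1 = B - A
  CvsB = c(A =  0, B = -1, C = 1)    # coef2 = C - B
)

# Design matrix quoted above: one row per array, one column per coefficient.
design <- rbind(
  array1 = c(1, 0),
  array2 = c(0, 1),
  array3 = c(-1, -1)
)
colnames(design) <- colnames(coefs)

# Express each array's fitted log-ratio in terms of the samples A, B, C.
array_contrasts <- design %*% t(coefs)
print(array_contrasts)
```

Each row of the printed matrix gives the weights on A, B and C for one
array, so the third row can be compared directly with A-C.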

I have one last question. In practice, is this approach identical to a
3x3 diagonal matrix (of ones) where each column represents an array
contrast?

More specifically:

1 0 0 ---> B-A
0 1 0 ---> C-B
0 0 1 ---> A-C
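
One way to probe this question numerically is sketched below in base R
(the names `contrasts3`, `design_d` and `design_id` are illustrative
assumptions, not limma objects). It writes the three array contrasts in
terms of the samples A, B and C, checks how many of them are linearly
independent, and compares the residual degrees of freedom left by the
two candidate design matrices when there are three arrays.

```r
# The three array contrasts expressed as weights on the samples A, B, C.
contrasts3 <- rbind(
  BvsA = c(A = -1, B =  1, C =  0),
  CvsB = c(A =  0, B = -1, C =  1),
  AvsC = c(A =  1, B =  0, C = -1)
)

# Sum of the three contrasts, sample by sample.
print(colSums(contrasts3))

# Number of linearly independent contrasts.
print(qr(contrasts3)$rank)

# Candidate design matrices for three arrays.
design_d  <- rbind(c(1, 0), c(0, 1), c(-1, -1))  # two coefficients
design_id <- diag(3)                              # three coefficients

# Residual degrees of freedom = number of arrays - rank of the design.
print(nrow(design_d)  - qr(design_d)$rank)
print(nrow(design_id) - qr(design_id)$rank)
```

Since (B-A) + (C-B) + (A-C) = 0, the third contrast is determined by the
other two. The two-column design uses that constraint, whereas the 3x3
matrix would fit one free coefficient per array.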

Joseph

On Wed, Feb 5, 2014 at 11:28 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> Dear Joseph,
>
>> Date: Tue,  4 Feb 2014 18:11:50 -0800 (PST)
>> From: "Joseph Shaw [guest]" <guest at bioconductor.org>
>> To: bioconductor at r-project.org, josph.sh at gmail.com
>> Subject: [BioC] Design/Contrast Matrix for Two Channel Microarray
>>
>>
>> Hi all,
>>
>> Could somebody explain the process used in developing the design matrix
>> for two channel microarray experiments in limma; in particular, those given
>> for each experiment in Figure 1 of the empirical Bayes paper
>> (http://www.statsci.org/smyth/pubs/ebayes.pdf)?
>>
>> For single channel arrays, the design matrix seems to assume the form of
>> standard linear model design matrices; that is, 1 where an array treatment
>> is present and 0 otherwise. From here, the resulting model parameters can be
>> tested with the implementation of an appropriate contrast matrix (where,
>> typically, each contrast effect sums to zero). This does not appear to be
>> the case for two-channel experiments.
>>
>> In the above paper, the aforementioned experiments are given in Kerr and
>> Churchill arrow notation (where the arrow head points toward the RNA sample
>> labelled with red dye and the sample at the arrow base is labelled green).
>>
>> The experiments can be summarised as follows:
>>
>> (a)
>> Red    Green
>> RNA1   RNA2
>>
>> For this experiment, it seems to me that the only parameter of interest
>> (let's call it mu1) is the response value (or the mean of the response
>> values if we have more than one identical replicate). Because the response
>> is estimated by (the mean of) the log2 fold change between red and green
>> channels, the design "matrix" in this instance is simply (1); this becomes
>> a column of 1 values if there is more than one identical replicate.
>>
>> (b)
>> Red    Green
>> RNA1   RNA2
>> RNA2   RNA1
>>
>> In this experiment, although there are two arrays, as in experiment (a)
>> there is only one comparison of interest (namely, the difference between
>> RNA1 and RNA2). Because the dyes in the second array are swapped relative
>> to the first array, the ratio is inverted too. Inverting the term inside
>> the logarithm negates the response (i.e. log2(RNA2/RNA1) =
>> -log2(RNA1/RNA2)), so the second array yields the negative of the
>> response from the first. For consistency, we must multiply its response
>> value by -1. As a result, we have the design matrix: (1, -1).
>>
>> I'm confused about how the design matrices are formed for experiments in
>> (c) and (d).
>>
>> In (c), RNA1 and RNA2 are compared through a common reference.
>>
>> (c)
>> Red    Green
>> Ref    RNA1
>> RNA1   Ref
>> RNA2   Ref
>>
>> The design matrix is given by (-1 0; 1 0; 1 1) -- where ";" denotes the
>> end of the matrix row; the first coefficient estimates the difference
>> between the RNA1 and the reference sample, whilst the second coefficient
>> estimates the difference between RNA1 and RNA2.
>
>
> It isn't easy to explain how this design matrix was derived, but it is easy
> to confirm that it works.  Consider the third array for example, which
> estimates RNA2-Ref (Red minus Green).  As you say, the first coef is
>
>   coef1 = RNA1-Ref
>
> and the second is
>
>   coef2 = RNA2-RNA1
>
> The third array estimates
>
>   RNA2-Ref = coef1 + coef2
>
> Hence the third row of the design matrix has to be c(1,1).
>
> You can easily compute these design matrices in limma.  Here is the code for
> Figure 1(c) in the paper:
>
>  > targets
>    Cy3 Cy5
>  1   A Ref
>  2 Ref   A
>  3 Ref   B
>  > parameters
>      AvsRef BvsA
>  Ref     -1    0
>  A        1   -1
>  B        0    1
>  > modelMatrix(targets,parameters=parameters)
>  Found unique target names:
>   A B Ref
>    AvsRef BvsA
>  1     -1    0
>  2      1    0
>  3      1    1
>
> Best wishes
> Gordon
>
>> Experiment (d) is a saturated direct design comparing three samples.
>>
>> (d)
>> Red    Green
>> B      A
>> C      B
>> A      C
>>
>> The design matrix is given by (1 0; 0 1; -1 -1); where the first
>> coefficient estimates the difference B - A and the second coefficient
>> estimates the difference C - B.
>>
>> Also, on page 39 of the Limma user guide
>> (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf),
>> you can find a design and contrast matrix for a direct two-colour design.
>> The experiment compares CD4, CD8 and DN. I'm not really sure how this
>> design/contrast works.
>>
>> Explanation of the above structures would be greatly appreciated.
>>
>> Joseph
>>


