[BioC] duplicateCorrelation

Tue Nov 2 12:49:40 CET 2004

> Date: Tue, 02 Nov 2004 09:09:56 +0000
> From: Jason Skelton <jps at sanger.ac.uk>
> Subject: [BioC] duplicateCorrelation
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <41874EE4.90006 at sanger.ac.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi All
>
> running duplicateCorrelation in limma
>
> cor <- duplicateCorrelation(nwaMA, ndups=2, spacing=840, design)
>
> I have three arrays with the following design
>            a   b   c
> Array 1    1   0   0
> Array 2    0  -1   0
> Array 3    0   0  -1
>
> The layout is:
> $ngrid.r
> [1] 12
> $ngrid.c
> [1] 4
> $nspot.r
> [1] 21
> $nspot.c
> [1] 20
>
> The correlation returned is 1 for every gene on the array. Is this correct?

Yes, this is correct.  It has nothing to do with your data but everything to do with the fact that
you have tried to estimate three coefficients using only three arrays.  In other words you have no
replication and no degrees of freedom to estimate standard deviations.  (You are effectively
trying to do three t-tests, but each one with only one observation.  Clearly this is not
possible.)
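The degrees-of-freedom argument can be checked in base R without limma. The following is only a sketch: the design matrix is the one from the question, while the log-ratios in y are made-up numbers for a single spot, used purely for illustration.

```r
# Design from the question: one coefficient per array (3 arrays, 3 columns).
design <- cbind(a = c(1,  0,  0),
                b = c(0, -1,  0),
                c = c(0,  0, -1))

n_arrays <- nrow(design)
n_coef   <- qr(design)$rank      # number of estimable coefficients
resid_df <- n_arrays - n_coef    # residual degrees of freedom: 3 - 3 = 0

# Made-up log-ratios for one spot (an assumption, not data from the post).
y   <- c(0.8, -1.1, -0.7)
fit <- lm(y ~ 0 + design)

resid_df            # 0
df.residual(fit)    # 0: the model fits exactly, so no variance can be estimated
```

With zero residual degrees of freedom every fitted value reproduces its observation, so there is nothing left over from which to estimate a standard deviation.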

Is it possible that you actually have three replicate arrays, two of them dye-swapped?  In that
case the design matrix should have only one column:

 design <- c(1,-1,-1)
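As a rough sketch of what the single-column design changes: with one coefficient and three arrays there are two residual degrees of freedom per spot. The limma calls below use argument and component names from the current limma documentation (design, ndups, spacing, correlation, consensus.correlation); these are assumptions and may differ in limma 1.8.1. The names design1 and dupcor are made up for illustration, and nwaMA is the object from the question, so that part runs only if limma and nwaMA are available.

```r
# One column: all three arrays estimate the same log-ratio,
# with arrays 2 and 3 dye-swapped.
design1 <- matrix(c(1, -1, -1), ncol = 1, dimnames = list(NULL, "ratio"))

resid_df1 <- nrow(design1) - qr(design1)$rank   # 3 - 1 = 2
resid_df1

# Hypothetical re-run of the analysis from the question with the new design.
if (requireNamespace("limma", quietly = TRUE) && exists("nwaMA")) {
  dupcor <- limma::duplicateCorrelation(nwaMA, design = design1,
                                        ndups = 2, spacing = 840)
  fit <- limma::lmFit(nwaMA, design = design1, ndups = 2, spacing = 840,
                      correlation = dupcor$consensus.correlation)
  fit <- limma::eBayes(fit)
}
```

This design is appropriate only if the three arrays really are replicates of the same comparison.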

Gordon

> the M & A values for any given set of duplicates across the array are
> similar
> and the original gpr files don't hint at anything obvious.
>
> Subsequent analysis with lmFit will work but ebayes won't, which makes
> sense, as there are no degrees of freedom?
>
> If anyone has any suggestions
>
> using R 1.9.1
> using limma 1.8.1
>
> many thanks
> Jason
>
> --------------------------------
> Jason Skelton
> Pathogen Microarrays
> Wellcome Trust Sanger Institute
> Hinxton
> Cambridge
> CB10 1SA
>
> Tel +44(0)1223 834244 Ext 7123
> Fax +44(0)1223 494919