[BioC] Question regarding handling technical replicates for Affy arrays

Tue May 30 02:53:50 CEST 2006

Dear Noel,

I'm sorry I didn't fully appreciate from your first email that only one of your samples has a
technical replicate, because you didn't give the whole biolrep vector.  A single replicated
sample is simply not enough to estimate the technical-replicate variance component.  You need
at least two.  That is the reason why duplicateCorrelation() returns a NaN answer.
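
As a rough illustration of the point (a sketch in base R, not limma's own code), counting how
often each value occurs in the biolrep vector from your email shows how many samples actually
carry technical replicates:

```r
# biolrep as given in the quoted email: 40 arrays, sample 5 hybridized twice
biolrep <- c(1:5, 5:39)

# Number of arrays per biological sample
arrays_per_sample <- table(biolrep)

# Samples that have more than one array, i.e. technical replicates
replicated <- names(arrays_per_sample)[arrays_per_sample > 1]
n_replicated <- length(replicated)
```

Here only sample 5 is replicated, so n_replicated is 1.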

I guess that most people would average the technical replicates or would choose the "best" one. 
It isn't likely to make a lot of difference.  There's no perfect solution because this isn't a
perfect experimental design.
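
Averaging the technical replicates could look roughly like the sketch below.  It uses base R
only; the matrix expr and its values are made-up toy data standing in for a probes-by-arrays
matrix of log-expression values, and average_reps is a hypothetical helper, not a limma
function.

```r
# Toy data (assumption): 3 probes x 5 arrays, arrays 2 and 3 share one RNA sample
expr <- matrix(c(1, 2, 3,
                 4, 5, 6,
                 6, 7, 8,
                 0, 1, 2,
                 9, 9, 9),
               nrow = 3,
               dimnames = list(paste0("probe", 1:3), paste0("array", 1:5)))
biolrep <- c(1, 2, 2, 3, 4)

# Hypothetical helper: average the columns that belong to the same biological sample
average_reps <- function(x, block) {
  cols_by_block <- split(seq_len(ncol(x)), block)
  sapply(cols_by_block, function(idx) rowMeans(x[, idx, drop = FALSE]))
}

# One column per biological sample; arrays 2 and 3 are merged into column "2"
expr_avg <- average_reps(expr, biolrep)
```

The averaged matrix has one column per RNA sample, so it can then be analysed as if each
sample had been hybridized to a single array.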

Best wishes
Gordon

> Date: Mon, 29 May 2006 01:20:10 -0700 (PDT)
> From: "noel0925 at sbcglobal.net" <noel0925 at sbcglobal.net>
> Subject: Re: [BioC] Question regarding handling technical replicates
> 	for Affy arrays
> To: bioconductor at stat.math.ethz.ch
>
>
> Hi Gordon,
>
> Thank you for your reply.
>
> Actually, I have looked at the consensus correlation
> and I obtain [1] NaN. This doesn't seem sensible.
>
> Perhaps I have specified the biological replicates
> incorrectly. The description of dupcor states that
> "Typically the blocks are biological replicates and
> the repeated observations are technical replicates."
> As such, I thought that it made sense to create a
> vector of the replicates as follows:
>
>
> biolrep <-
> c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
>
> Thus, there are 39 DIFFERENT RNA samples, one of
> which is replicated (hybridized to two different
> arrays), giving 40 arrays in total. The number 5
> appears twice since it is the only sample for which
> there is a technical replicate. Arrays 1-20 are of
> RNA1, arrays 21-30 are RNA2, and arrays 31-40 are
> RNA3.
>
> Have I specified the biological replicates properly?
> The "biolrep" examples I have seen in the literature
> confused me a bit since they seem to specify both a
> block of biological replicates and technical reps
> within those blocks. But the cases given are for
> two-color arrays. For example, in section 23.5 of the
> Limma book chapter, the first example is for the case
> where two wt and two mut mice from the same strain are
> compared using two arrays for each pair, so that the
> 1st and 2nd arrays, and the 3rd and 4th arrays, are
> technical reps. So here,
> biolrep <- c(1,1,2,2).
>
> This is different however from the Affy data I
> describe since the 3 different genotypes are all on
> separate arrays rather than both wt and mut on the
> same array.
>
> If I do:
> biolrep <-
> c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
>
> then corfit$consensus yields NaN. Though the
> biological reps are not explicitly defined here, I
> would assume they are inferred from
> f <- factor(targets$Target, levels = c("RNA1", "RNA2", "RNA3")).
>
>
> If I do:
> biolrep <- c(rep(1,20), rep(2,10), rep(3,10))
> then corfit$consensus yields Inf and this does not
> indicate which arrays are technical reps.
>
> Any further insight you could offer would be great.
> Thanks very much,
>
> Noelle
>