[BioC] duplicate correlation on Agilent 4x44 arrays

Wed Apr 11 18:56:04 CEST 2007

On Tuesday 10 April 2007 08:07, Mitch Levesque wrote:
> Gordon,
>
> Thanks for the reply. I am not using any particular instruction set, just
> what I have put together from the User Guide.
>
> You were right about the file dimensions, they are different:
> > dim(RG)
>
> [1] 44407     4
>
> > gal <- readGAL()
> > dim(gal)
>
> [1] 180880     10
>
> Is it possible to read the duplicate positions directly off of the gal
> file? I tried:

If you are thinking that the four different arrays represent "duplicates", 
then that probably isn't correct.  The "duplicates" in the sense of 
duplicateCorrelation are duplicate spots with the same sample hybridized to 
them; hybridizing the same sample multiple times on the same slide is not the 
typical use case (but perhaps you did do this?).

Agilent arrays do not carry many duplicate spots unless the array was 
specifically designed that way.  I don't recall what you said about your array 
design, but unless many thousands of the 44k probes within one array are 
duplicated, using duplicateCorrelation is probably not warranted.
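
One quick way to check this is to count how many probes are spotted more 
than once.  The following is only a rough sketch in base R; the probe 
identifiers are made up for illustration, and with real data they would come 
from the annotation read with the arrays (RG$genes$ProbeName is an assumption 
about where they live, not something confirmed in this thread).

```r
# Made-up probe identifiers for illustration only; with real data these
# might be taken from something like RG$genes$ProbeName (an assumption).
probe_ids <- c("A_01", "A_02", "A_02", "A_03", "A_04", "A_04", "A_04", "A_05")

# Number of spots per distinct probe
spots_per_probe <- table(probe_ids)

# Probes that occur on more than one spot
replicated <- spots_per_probe[spots_per_probe > 1]

# How many distinct probes are replicated, and what fraction of all probes
n_replicated_probes <- length(replicated)
frac_replicated <- n_replicated_probes / length(spots_per_probe)

print(n_replicated_probes)
print(frac_replicated)
```

If only a small fraction of probes is replicated, or the number of 
replicates differs from probe to probe, the within-array duplicate approach 
is unlikely to fit.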

> layout <- getLayout(gal, guessdups=TRUE)

The confusion here, I think, comes from the fact that the GAL file describes 
the entire slide (which includes 4 arrays).  You should not use the GAL file 
for these arrays; instead, take the information from the Agilent FE (Feature 
Extraction) file, which read.maimages will load automatically with 
source='agilent'.  If there are other columns that you need, you can request 
them directly in the read.maimages() call through its other.columns 
argument--see the documentation.
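
As a rough sketch of what that might look like: the wrapper name 
read_agilent is hypothetical, and the two extra column names are assumptions 
about what your FE output contains, so check the header of your own files 
before using them.  It needs limma and real FE files to actually run.

```r
# Sketch only: read_agilent is a hypothetical helper, and the other.columns
# names below are assumed to exist in the Feature Extraction output.
read_agilent <- function(files, path = NULL) {
  if (!requireNamespace("limma", quietly = TRUE)) {
    stop("the limma package is required")
  }
  limma::read.maimages(
    files,
    source = "agilent",   # use limma's built-in Agilent column mapping
    path = path,
    # Extra per-spot columns to keep alongside the intensities (assumed names)
    other.columns = c("gIsFeatNonUnifOL", "rIsFeatNonUnifOL")
  )
}

# Usage, with hypothetical file names:
# RG <- read_agilent(c("array1.txt", "array2.txt"), path = "data")
# dim(RG)
```

Each FE file holds a single array, so no slide-wide GAL layout is involved.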

Note also that Agilent uses so-called orange-packed (hexagonally packed) array 
designs, so the old idea of row/column doesn't translate perfectly, as each 
row is offset from the next.  In addition, within a given array (and on the 
4x44, there are four such arrays), there are no subarrays.

> I haven't tried without the normexp, but I will test it. Thanks again.

Agilent uses a rather sophisticated background estimation method, so I agree 
with Gordon that there really isn't a need to do more for these arrays.  I 
would encourage you to read the technical manual for the platform for a full 
description of the algorithm.

Sean