[BioC] Fw: random location of duplicate spots and use of limma

Gordon K Smyth smyth at wehi.EDU.AU
Fri Jan 7 15:21:36 CET 2005


> Date: Fri, 7 Jan 2005 10:11:30 +0100
> From: "Ingunn Berget" <ingunn.berget at umb.no>
> Subject: [BioC] Fw: random location of duplicate spots and use of
> 	limma
> To: <bioconductor at stat.math.ethz.ch>
>
> Hello
>
> There are approximately 6000 different genes on the arrays, there are two spots for each gene
> The duplicated spots have random location, which means that the number of spots between each
> duplicate is not the same for every gene. This is the summary for the distances:
>
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    4.00   32.00   71.00   86.59  135.00  244.00
>
> (Distance here means number of spots between the two duplicates)
>
> The function duplicateCorrelation in limma can be used to estimate correlation between
> within-array duplicates, the methodology is based on the assumption that duplicates are equally
> spaced. Since this assumption is not fulfilled here, does this mean that I cannot calculate the
> correlations and must take the average of the duplicates? Are there some functions to do this in
> limma or other BioC packages?

There are no functions in limma, or in other packages, to specifically handle this situation, either
to compute correlations or to take averages.

However, none of the duplicates on your arrays are very far apart.  It might be reasonable to
treat them as approximately equally spaced.  Try sorting the data into gene ID order, so that the
two spots for each gene become adjacent, and then use ndups=2 and spacing=1 in
duplicateCorrelation.  E.g.,

# Reorder the spots by gene ID so that duplicates sit in consecutive rows
o <- order(MA$genes$ID)
MAsorted <- MA[o,]

This does assume that *every* probe occurs twice or an even number of times.  If this is not so,
you'll need to remove the exceptions first.
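
One possible way to drop the exceptions and then estimate the correlation is sketched below.  This
is only an illustration under stated assumptions: the toy vector `ids` and matrix `M` are
hypothetical stand-ins for MA$genes$ID and MA$M, the probe named "ctrl" is invented to play the
role of an exception, and the single-column design assumes that every array measures the same
comparison.  The limma calls run only if limma and statmod are installed.

```r
# Toy data (hypothetical): 50 genes spotted twice in random order, plus one
# probe ("ctrl") that occurs only once and so violates the assumption.
set.seed(1)
ids <- sample(c(rep(paste0("g", 1:50), each = 2), "ctrl"))
M <- matrix(rnorm(length(ids) * 4), ncol = 4)  # log-ratios for 4 arrays

# Keep only probes that occur an even number of times.
counts <- table(ids)
keep <- ids %in% names(counts)[counts %% 2 == 0]

# Sort the remaining spots by ID so that duplicates are in adjacent rows.
o <- order(ids[keep])
ids_sorted <- ids[keep][o]
M_sorted <- M[keep, , drop = FALSE][o, , drop = FALSE]

# Sanity check: rows 1-2, 3-4, ... must each carry the same ID.
stopifnot(all(ids_sorted[c(TRUE, FALSE)] == ids_sorted[c(FALSE, TRUE)]))

# Estimate the duplicate correlation and fit the model with ndups=2, spacing=1.
if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  design <- matrix(1, nrow = ncol(M_sorted), ncol = 1)  # assumed design
  corfit <- limma::duplicateCorrelation(M_sorted, design, ndups = 2, spacing = 1)
  fit <- limma::lmFit(M_sorted, design, ndups = 2, spacing = 1,
                      correlation = corfit$consensus.correlation)
}
```

With a real MAList the same logical index and ordering could be applied to the object itself,
e.g. MA[keep,] followed by the sort shown above, before calling duplicateCorrelation.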

Gordon

> ------------------------------------------------------------------------------
> Ingunn Berget
> Norwegian University of Life Sciences
> Department of Animal and Aquaculture


