[BioC] Limma: different numbers of duplicated spots

Gordon K Smyth smyth at wehi.EDU.AU
Fri Apr 6 01:13:09 CEST 2007


Dear Jeremy,

Personally, I'd treat all the genes as duplicated twice.  In this approach, the small group of
special genes which are actually duplicated 20 times would each be treated as 10 different genes.
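
As a rough illustration of the bookkeeping this implies, here is a sketch in Python (not limma code). The function name pair_replicates and the ".k" suffix on the pseudo-gene labels are made up for this example. It assumes you have one gene name per spot, and it groups the spots of each gene into consecutive blocks of two, so a gene printed 20 times yields 10 pseudo-genes.

```python
from collections import defaultdict


def pair_replicates(gene_names, ndups=2):
    """Return (order, pseudo_ids).

    `order` lists row indices sorted so that every block of `ndups`
    consecutive rows belongs to one pseudo-gene; `pseudo_ids` gives the
    (hypothetical) label of each row in that new order.
    """
    # Collect the row indices of every spot, keyed by gene name.
    rows_by_gene = defaultdict(list)
    for row, name in enumerate(gene_names):
        rows_by_gene[name].append(row)

    order, pseudo_ids = [], []
    for name in sorted(rows_by_gene):
        rows = rows_by_gene[name]
        # Assumption: spots that do not fill a complete block are dropped.
        usable = len(rows) - len(rows) % ndups
        for start in range(0, usable, ndups):
            block = start // ndups + 1
            for row in rows[start:start + ndups]:
                order.append(row)
                pseudo_ids.append(f"{name}.{block}")
    return order, pseudo_ids


# Small usage example: two ordinary genes and one gene printed 4 times.
names = ["b", "a", "ctrl", "a", "ctrl", "b", "ctrl", "ctrl"]
order, ids = pair_replicates(names)
print(order)
print(ids)
```

If the data rows are reordered this way, the two spots of each pseudo-gene sit next to each other, which should correspond to calling duplicateCorrelation with ndups=2 and spacing=1.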

Best wishes
Gordon

> Date: Wed, 4 Apr 2007 14:51:13 -0700
> From: "Jeremy Davis-Turak" <jeremydt at gmail.com>
> Subject: [BioC] Limma: different numbers of duplicated spots
> To: bioconductor at stat.math.ethz.ch
>
> Hi BioC list,
>
> I am analyzing some new Agilent 4x44 C. elegans arrays, and as on our previous
> Agilent C. elegans arrays, there are 120 genes that are printed many (> 10)
> times.  However, now on each array everything is duplicated: those 120 spots
> are printed 20 times (not 10), and all others are printed twice (and one
> spot is printed 4 times; it probably was meant to be 2 different genes).
> As far as I can tell, the duplicated spots are randomly spaced.  I would
> like to use duplicateCorrelation on the normalized data, sorted by gene
> name, as described previously on the list:
>
> https://stat.ethz.ch/pipermail/bioconductor/attachments/20060123/4890ea8e/attachment.pl
>
> My only problem now is the spots that are replicated 20 times.  In the past
> I haven't dealt with them using very stringent statistics, since it was only
> 120 spots that I was dealing with (and maybe 1 gene in the group was
> differentially expressed).  Now however, since all 20K spots are duplicated,
> we need to take care of the duplicates.  Clearly duplicateCorrelation is the
> simplest way to do this, but it won't work if we have 120 genes that are
> printed 20 times.
>
> My question is: how do I deal with these genes?  Could I just ignore those
> 120 genes for the calculation of the consensus correlation?  I read on this
> list that small numbers of genes won't affect this calculation: 120/20K is
> less than 1% of the genes.
>
> If I do that, what would become of the 120 spots?  Can I somehow apply the
> same consensus correlation to them?
>
> What other solutions do people propose?
>
>
> Thanks in advance for your time.
>
> Jeremy Davis-Turak


