[BioC] Probes vs Genes/Targets for Illumina BeadChips

Thu May 29 18:39:40 CEST 2008

Hello all,

I know the general question of "should I summarize/average/etc. probes
that map to the same gene?" has been discussed many times before. But I
feel the answer might be slightly different on the Illumina platform (at
least for the Mouse chip, which is the one I have been using).

For non-control probes, there is simply no advantage to using
probe-summarized data over target-summarized data, since you have
essentially the same number of distinct sequences either way. Even
though the probe names have changed and there appear to be ~70k of
them, there are only ~46k distinct probe sequences, which matches the
number of targets almost exactly...

The numbers:

> length(as.list(lumiMouseV1TARGETID2NUID))
[1] 46116

> length(as.list(lumiMouseV1PROBEID2NUID))
[1] 70182

> length(unique(as.list(lumiMouseV1PROBEID2NUID)))
[1] 46120
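
To see how that redundancy is spread out, one could also tabulate how
many probe IDs share each distinct sequence. Below is a minimal sketch
on a made-up mapping: the probe IDs and NUIDs are invented for
illustration, and the named vector only stands in for something like
unlist(as.list(lumiMouseV1PROBEID2NUID)). It is not output from the real
annotation package.

```r
# Toy stand-in for the probe ID -> NUID mapping.
# Names are probe IDs, values are NUIDs (sequence-derived identifiers).
# All identifiers here are hypothetical.
probe2nuid <- c(
  probe_a = "NUID_1",
  probe_b = "NUID_1",
  probe_c = "NUID_2",
  probe_d = "NUID_3",
  probe_e = "NUID_3",
  probe_f = "NUID_3"
)

n_probes <- length(probe2nuid)          # number of probe IDs
n_seqs   <- length(unique(probe2nuid))  # number of distinct sequences

# Number of probe IDs pointing at each distinct sequence
probes_per_seq <- table(probe2nuid)

# How many sequences are covered by 1, 2, 3, ... probe IDs
redundancy <- table(probes_per_seq)

print(n_probes)
print(n_seqs)
print(redundancy)
```

On the real mapping, a redundancy table dominated by 1s and 2s would be
consistent with the probe IDs simply having been renamed rather than
new sequences having been added.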

Cheers,

Cei

sessionInfo()
R version 2.7.0 (2008-04-22)
i386-apple-darwin8.10.1

locale:
C

attached base packages:
[1] stats     graphics  grDevices datasets  tools     utils     methods
[8] base

other attached packages:
 [1] lumiMouseV1_1.3.1     lumiMouseAll.db_1.2.0 AnnotationDbi_1.2.0
 [4] RSQLite_0.6-8         DBI_0.2-4             lumi_1.6.0
 [7] mgcv_1.3-30           affy_1.18.0           preprocessCore_1.2.0
[10] affyio_1.8.0          Biobase_2.0.0
