[BioC] control probes and missing annotations in Affy Mouse Gene 1.0 ST arrays

Robert Castelo robert.castelo at upf.edu
Thu Dec 3 18:39:39 CET 2009


dear list,

i'm working with the annotation package

mogene10sttranscriptcluster.db

for the Affy Mouse Gene 1.0 ST array and have the following two
questions. the first is about using the nsFilter() function to filter
out control probes. with the older affy chips i'd do something like:

filteredEset <- nsFilter(eset, feature.exclude="^AFFX")

but as far as i understand these newer chips do not have the prefix AFFX
in the control probe names. all probe names i find in the annotation
package above consist of numbers. so, i'd like to know if there is any
easy way with nsFilter of excluding the control probes.

the second question is about failing to retrieve annotations for
specific probe names, for instance, if i do the following:

library(mogene10sttranscriptcluster.db)

mogene10sttranscriptclusterENTREZID[["10566258"]]
[1] NA

i.e., i don't find annotation for probe 10566258, however if i download
MoGene-1_0-st-v1.na30.mm9.transcript.csv from the NetAffx analysis
center and grep this probe name i obtain the following entry:

$ grep 10566258 MoGene-1_0-st-v1.na30.mm9.transcript.csv
"10566258","10566258","chr7","-","110975048","110976443","26","ENSMUST00000023934 // Hbb-b1 // hemoglobin, beta adult major chain // 7 E3|7 50.0 cM // 15129 /// AB364478 // Hbb-b2 // hemoglobin, beta adult minor chain // 7 E3|7 50.0 cM // 15130","ENSMUST00000023934 // ENSEMBL // Beta-globin gene:ENSMUSG00000052305 // chr7 // 100 // 100 // 26 // 26 // 0 /// AB364478 // GenBank // Mus musculus HBB2 mRNA for hemoglobin beta chain subunit, complete cds. // chr7 // 62 // 81 // 13 // 21 // 0 /// AB364477 // GenBank // Mus musculus HBB1 mRNA for hemoglobin beta chain subunit, complete cds. // chr7 // 52 // 81 // 11 // 21 // 0 /// AF071431 // GenBank // Mus musculus beta globin mRNA, partial cds. // chr7 // 40 // 19 // 2 // 5 // 0
[...etcetc]

i'd like to know whether this is a simple synchronization issue between
NetAffx and this BioC annotation package that will get fixed in the
next release (i'm using the current BioC devel version just in case), or
whether i am misusing the package and there is a way to retrieve the
annotation.
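
for anyone who wants to reproduce this, here is a minimal sketch of how one could check whether the probe is a key of the map at all, as opposed to being a key with no Entrez Gene ID assigned. it assumes the annotation package is installed and uses keys() and mappedkeys() from AnnotationDbi. the variable names are only for illustration, and no output is shown because it depends on the package version.

```r
library(mogene10sttranscriptcluster.db)

# probe (transcript cluster) identifier from the example above
probe <- "10566258"

# the probe-to-Entrez-Gene map provided by the annotation package
x <- mogene10sttranscriptclusterENTREZID

# is the probe one of the keys of the map at all?
probe %in% keys(x)

# is it among the keys that have an Entrez Gene ID assigned?
probe %in% mappedkeys(x)
```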

thanks!!

sessionInfo()
R version 2.11.0 Under development (unstable) (2009-10-06 r49948) 
x86_64-unknown-linux-gnu 

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] mogene10sttranscriptcluster.db_4.0.1 org.Mm.eg.db_2.3.6
[3] RSQLite_0.7-3                        DBI_0.2-4
[5] AnnotationDbi_1.9.2                  Biobase_2.5.8


