[BioC] Affymetrix HuGene 2.0 ST annotation

Wed Jun 4 16:20:36 CEST 2014

Dear List,

I recently came across this post, which helped me analyse data from this array:
https://stat.ethz.ch/pipermail/bioconductor/2014-May/059408.html

However, I am concerned about the annotation and wonder whether what I get is usual for this kind of array.

Code:
eset_mat <- as.matrix(Eset)
dim(eset_mat) #53617     6

library(annotate)
library(hugene20sttranscriptcluster.db)

annodb <- "hugene20sttranscriptcluster.db"
ID     <- featureNames(Eset)
Symbol <- as.character(lookUp(ID, annodb, "SYMBOL"))
Name   <- as.character(lookUp(ID, annodb, "GENENAME"))
Entrez <- as.character(lookUp(ID, annodb, "ENTREZID"))
Ensembl <- as.character(lookUp(ID, annodb, "ENSEMBL"))

annot <- data.frame(ID = ID, Symbol = Symbol, Description = Name, EntrezID = Entrez, EnsemblID = Ensembl)

length(which(Symbol != "NA")) # 23672 =====> is this normal?
length(Symbol)  # 53617
-----
Is it normal to get <50% annotation? 
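
To put a number on it: 23672 of 53617 is roughly 44%. Below is a minimal sketch of how that coverage could be computed. The helper name annotation_coverage and the toy vector symbols are made up for illustration (they are not part of my analysis or of any package); it assumes missing annotations show up either as real NA values or as the string "NA", which is what as.character() gives for a missing lookUp() result.

```r
# Hypothetical helper (illustration only): count how many IDs carry an
# annotation. Both a real NA and the literal string "NA" are treated
# as "not annotated".
annotation_coverage <- function(x) {
  annotated <- !is.na(x) & x != "NA"
  c(annotated = sum(annotated),
    total     = length(x),
    fraction  = mean(annotated))
}

# Toy stand-in for the Symbol vector (made-up values, not the array data)
symbols <- c("TP53", "NA", "BRCA1", NA, "NA")
annotation_coverage(symbols)
```

Calling annotation_coverage(Symbol) on the real vector should give the same counts as the two length() calls above, plus the fraction.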

(At present I have not done any filtering before limma; I used all 53K+ probes for DE.)

Many Thanks,
Natasha 

 -- output of sessionInfo(): 

--
Sent via the guest posting facility at bioconductor.org.