[BioC] Incomplete EntrezID annotations for the Mouse 430 v2.0 probe-set

Vincent Carey stvjc at channing.harvard.edu
Wed Nov 3 03:46:12 CET 2010


On Tue, Nov 2, 2010 at 2:14 PM, ANJAN PURKAYASTHA
<anjan.purkayastha at gmail.com> wrote:
> Hi,
> I have run into the following problem. I created a probeID-EntrezID mapping
> for the Affy mouse array from the cognate annotation file Mouse4302.db.
> Unfortunately about 10000 genes do not have a corresponding EntrezID.

What do you mean by "10000 genes"?

The following shows that 7688 probesets do not have Entrez ID mappings
(using current packages).

    > length(ls(mouse4302ENTREZID))
    [1] 45101
    > length(setdiff(ls(mouse4302ENTREZID), mappedkeys(mouse4302ENTREZID)))
    [1] 7688

That's just a fact of life: not every probeset on the array corresponds
to a gene with an Entrez Gene record.

> Many of these are genes with known functions. If I cannot map an EntrezID to
> these then I cannot retrieve GO annotations and consequently I cannot do a
> Gene Set Enrichment analysis using GOstats.

This is not really correct.  You can use whatever groupings and
mappings you like with GOstats.  See the GOstatsForUnsupportedOrganisms
vignette in GOstats for extensive details on dealing with a somewhat
more difficult situation.  When you say the genes have
"known functions", perhaps you can use that knowledge to provide GO
associations for the unmapped genes, or, if the functions you refer to
do not have names in GO, you can create your own functional grouping
of genes.
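
To make the idea of a user-defined functional grouping concrete, here is a
minimal base-R sketch of the kind of hypergeometric over-representation
test that GOstats is built around.  It does not call GOstats at all; the
gene identifiers, the set names and the helper over_rep_p() are all
hypothetical and exist only for illustration.

```r
# Toy universe of gene (or probeset) identifiers; all names are hypothetical.
universe <- paste0("g", 1:20)

# A user-defined functional grouping: set name -> member genes.
gene_sets <- list(
  kinase_like    = paste0("g", 1:5),
  transport_like = paste0("g", 6:12)
)

# Genes picked out by the experiment (e.g. differentially expressed).
selected <- c("g1", "g2", "g3", "g4", "g15")

# Hypothetical helper: upper-tail hypergeometric p-value for one gene set.
over_rep_p <- function(set, selected, universe) {
  set      <- intersect(set, universe)
  selected <- intersect(selected, universe)
  k <- length(intersect(set, selected))  # selected genes that fall in the set
  # P(X >= k) when drawing length(selected) genes from the universe
  phyper(k - 1, length(set), length(universe) - length(set),
         length(selected), lower.tail = FALSE)
}

p_values <- sapply(gene_sets, over_rep_p,
                   selected = selected, universe = universe)
print(p_values)
```

The grouping is just a named list, so it can come from any source you
trust; nothing in the test itself depends on Entrez IDs.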

> Does anyone have an updated annotation file?

Your sessionInfo shows that you are not using the current version of
R, but that is not the main concern.  If you have gene:GO mappings and
gene sets that you prefer to those available through the annotation
packages, you can use those mappings and sets to drive the GOstats
analysis.
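
As a rough sketch of what that can look like, the following follows the
pattern of the GOstatsForUnsupportedOrganisms vignette: build a GOFrame
from your own gene:GO table, turn it into a GeneSetCollection, and hand
that to GSEAGOHyperGParams.  The table my_go_map, the gene identifiers
and the selected genes are invented toy data (assumptions for
illustration only), and the code assumes AnnotationDbi, GSEABase,
GOstats and GO.db are installed.

```r
# Sketch only: the mapping below is invented toy data, not real mouse annotation.
library(AnnotationDbi)  # GOFrame(), GOAllFrame()
library(GSEABase)       # GeneSetCollection(), GOCollection()
library(GOstats)        # GSEAGOHyperGParams(), hyperGTest()

# Hypothetical gene:GO table.  Column order matters:
# GO ID, evidence code, gene ID.
my_go_map <- data.frame(
  go_id    = c("GO:0004672", "GO:0004672", "GO:0003700",
               "GO:0003700", "GO:0005515"),
  evidence = c("IDA", "IEA", "IEA", "TAS", "IEA"),
  gene_id  = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  stringsAsFactors = FALSE
)

go_frame     <- GOFrame(my_go_map, organism = "Mus musculus")
go_all_frame <- GOAllFrame(go_frame)  # adds ancestor GO terms via GO.db
gsc <- GeneSetCollection(go_all_frame, setType = GOCollection())

universe <- unique(my_go_map$gene_id)  # every gene that has a mapping
selected <- c("geneA", "geneB")        # hypothetical genes of interest

params <- GSEAGOHyperGParams(
  name              = "custom gene:GO mapping (toy example)",
  geneSetCollection = gsc,
  geneIds           = selected,
  universeGeneIds   = universe,
  ontology          = "MF",
  pvalueCutoff      = 0.05,
  conditional       = FALSE,
  testDirection     = "over"
)
```

From there, hyperGTest(params) runs the test, as the vignette shows.  With
a real mapping you would replace my_go_map with your own table and use
whatever gene identifiers that table is keyed on.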

My sessionInfo:

R version 2.12.0 Patched (2010-10-15 r53331)
Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices datasets  tools     utils     methods
[8] base

other attached packages:
[1] mouse4302.db_2.4.5   org.Mm.eg.db_2.4.6   RSQLite_0.9-2
[4] DBI_0.2-5            AnnotationDbi_1.11.9 Biobase_2.10.0
[7] weaver_1.15.0        codetools_0.2-2      digest_0.4.2

> Many thanks in advance,
> Anjan