[BioC] Annotating limma results: Affymetrix probe IDs not mapping to hugene10stprobeset.db

James W. MacDonald jmacdon at med.umich.edu
Tue Jan 17 16:16:44 CET 2012


Hi Stephen

On 1/17/2012 9:58 AM, Stephen Turner wrote:
> I asked a similar question yesterday - wanted to clarify and give more
> information. I am using limma to analyze microarray data from Affymetrix
> HuGene 1.0 ST arrays. I'm reading in the CEL files using ReadAffy. Both
> sources of annotation confirm that I'm using the hugene1.0st array:
>
>> affybatch@cdfName
> [1] "HuGene-1_0-st-v1"
>> eset@annotation
> [1] "hugene10stv1"
>
> I fit a model, and now I want to annotate the results with gene symbols
> rather than the probeset IDs:
>
>> fit <- lmFit(eset, design)
>> head(fit$genes)
>         ID
> 1 7892501
> 2 7892502
> 3 7892503
> 4 7892504
> 5 7892505
> 6 7892506
>
> When I try to use getSYMBOL (as per Gordon's suggestion from a previous
> post: https://stat.ethz.ch/pipermail/bioconductor/2011-February/037866.html),
> none of these IDs map to a symbol:
>
>> getSYMBOL(head(fit$genes$ID), "hugene10stprobeset.db")

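You want the hugene10sttranscriptcluster.db package. The IDs in your fit 
look like transcript cluster IDs rather than probeset IDs, which is why 
so few of them match hugene10stprobeset.db. By default oligo summarizes 
at the transcript level.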
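
A minimal sketch of the lookup, assuming the annotate and 
hugene10sttranscriptcluster.db packages are installed and that fit is the 
lmFit object from your session. The Symbol column name is only an 
illustrative choice, and I have not run this against your data:

```r
library(annotate)
library(hugene10sttranscriptcluster.db)

# IDs as stored by lmFit; coerce to character for the annotation lookup
ids <- as.character(fit$genes$ID)

# Check how many IDs have a symbol in the transcript-level package
sum(ids %in% mappedkeys(hugene10sttranscriptclusterSYMBOL))

# Add gene symbols next to the IDs (Symbol is a hypothetical column name)
fit$genes$Symbol <- getSYMBOL(ids, "hugene10sttranscriptcluster.db")
head(fit$genes)
```

Some IDs will probably still come back as NA, since control and 
unannotated transcript clusters have no gene symbol.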

Best,

Jim


> 7892501 7892502 7892503 7892504 7892505 7892506
>       NA      NA      NA      NA      NA      NA
>
> In fact, of my 32,321 probeset IDs, only 150 match up with the IDs in the
> hugene10stprobeset.db package:
>> mapped_probes <- mappedkeys(hugene10stprobesetSYMBOL)
>> head(mapped_probes)
> [1] "7896741" "7896743" "7896745" "7896755" "7896757" "7896758"
>> length(fit$genes$ID)
> [1] 32321
>> length(mapped_probes)
> [1] 238111
>> sum(fit$genes$ID %in% mapped_probes)
> [1] 150
>
> Thanks in advance for any help!
>
> Stephen
>
>> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>  [1] hugene10stv1probe_2.9.0     BiocInstaller_1.2.1
>  [3] hugene10stv1cdf_2.9.1       hugene10stprobeset.db_8.0.1
>  [5] org.Hs.eg.db_2.6.4          RSQLite_0.11.1
>  [7] DBI_0.2-5                   annotate_1.32.1
>  [9] AnnotationDbi_1.16.10       pvclust_1.2-2
> [11] calibrate_1.7               gplots_2.10.1
> [13] KernSmooth_2.23-7           caTools_1.12
> [15] bitops_1.0-4.1              gdata_2.8.2
> [17] gtools_2.6.2                limma_3.10.1
> [19] arrayQualityMetrics_3.10.0  affy_1.32.0
> [21] Biobase_2.14.0

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
