[BioC] where to find a list of housekeeping genes for Affy array (Gene 1.0 ST)

James W. MacDonald jmacdon at uw.edu
Fri Apr 19 15:39:29 CEST 2013



On 4/19/2013 9:25 AM, Robert Castelo wrote:
> hi,
>
> if you're searching for something readily available in R, you can try 
> the following:
>
> library(BiocInstaller) ## assuming you installed some BioC package once
> biocLite("tweeDEseqCountData")
>
> library(tweeDEseqCountData)
> data(hkGenes)
> length(hkGenes)
> [1] 669
> head(hkGenes)
> [1] "ENSG00000149925" "ENSG00000102144" "ENSG00000142676" "ENSG00000108298"
> [5] "ENSG00000144713" "ENSG00000075624"
>
> check out the help page of 'hkGenes' for the source publication of 
> this list.
>
> as you see, these are Ensembl gene identifiers. if you need Affy 
> IDs for a particular Affy chip, let's say HG-U133 Plus 2.0, you can 
> use the identifier mapping functionality of the package GSEABase. 
> it is designed to map identifiers between gene sets and 
> ExpressionSet objects, but you can make it do this job by passing 
> the housekeeping gene list as if it were one gene set:
>
> library(GSEABase)
>
> dummygs <- GeneSet(hkGenes, geneIdType=ENSEMBLIdentifier())
>
> hkGenesHGU133plus2AffyIDs <- geneIds(mapIdentifiers(dummygs, 
> AnnoOrEntrezIdentifier("hgu133plus2")))
>
> length(hkGenesHGU133plus2AffyIDs)
> [1] 1263
>
> head(hkGenesHGU133plus2AffyIDs)
> [1] "200966_x_at" "214687_x_at" "238996_x_at" "1558365_at"  "200737_at"
> [6] "200738_s_at"

Or you could use more direct methods:

library(hgu133plus2.db)
mapped.genes <- select(hgu133plus2.db, keys=hkGenes, columns="PROBEID",
                       keytype="ENSEMBL")

head(mapped.genes)

           ENSEMBL     PROBEID
1 ENSG00000149925 200966_x_at
2 ENSG00000149925 214687_x_at
3 ENSG00000149925 238996_x_at
4 ENSG00000102144  1558365_at
5 ENSG00000102144   200737_at
6 ENSG00000102144 200738_s_at
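
Note that the mapping is one-to-many: each of the two Ensembl genes shown above maps to three probesets. As a minimal sketch of one way to handle that in base R, the block below rebuilds just the six rows printed above as a toy data frame (in practice you would use the real mapped.genes returned by select()), drops any rows without a probeset, counts probesets per gene, and collects the unique probeset IDs. The names probes.per.gene, hk.probes and probes.by.gene are made up for illustration. For the Gene 1.0 ST array in the subject line, the same select() call should presumably work with that chip's annotation package (hugene10sttranscriptcluster.db, as far as I know), but that was not run in this thread.

```r
# Toy copy of the six rows printed by head(mapped.genes) above;
# replace this with the real result of select() in practice.
mapped.genes <- data.frame(
  ENSEMBL = c("ENSG00000149925", "ENSG00000149925", "ENSG00000149925",
              "ENSG00000102144", "ENSG00000102144", "ENSG00000102144"),
  PROBEID = c("200966_x_at", "214687_x_at", "238996_x_at",
              "1558365_at", "200737_at", "200738_s_at"),
  stringsAsFactors = FALSE
)

# Drop rows with no probeset ID (NA in PROBEID), if there are any.
mapped.genes <- mapped.genes[!is.na(mapped.genes$PROBEID), ]

# Number of probesets per housekeeping gene.
probes.per.gene <- table(mapped.genes$ENSEMBL)

# Unique probeset IDs, e.g. for subsetting an expression matrix by row name.
hk.probes <- unique(mapped.genes$PROBEID)

# Probesets grouped by gene, as a named list.
probes.by.gene <- split(mapped.genes$PROBEID, mapped.genes$ENSEMBL)
```

hk.probes could then be used to pick out the housekeeping rows of an expression matrix, for example with rownames(x) %in% hk.probes, where x is a hypothetical matrix with probeset IDs as row names.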


Best,

Jim


>
>
> cheers,
> robert.
>
>
> On 04/18/2013 02:03 AM, Jack Luo wrote:
>> Not sure whether it's an appropriate question for Bioconductor. Is there a
>> place to find a list of housekeeping genes (identified by Affy)?
>>
>> Thanks,
>>
>> -Jack

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


