[BioC] creating GSEA files using biomart

Juliet Hannah juliet.hannah at gmail.com
Thu Sep 13 17:29:06 CEST 2012


I am trying to create a GSEA chip file. This example uses Affy data,
for which the chip file is already available; I'm doing this as an
exercise in preparation for other platforms.

The chip file should look like:

Probe Set ID	Gene Symbol	Gene Title
244901_at	ORF25	hypothetical protein
244902_at	NAD4L	NADH dehydrogenase subunit 4L
244912_at	CCB382	cytochrome c biogenesis orf382
244919_at	CCB203	cytochrome c biogenesis orf203
244925_at	NAD7	NADH dehydrogenase subunit 7

How can I obtain the third column (Gene Title) from biomaRt? I tried
searching the attributes, but couldn't find the right name. Is it a
matter of trial and error to find the correct attribute, or are there
systematic ways to find it? Here is what I have so far:

library(biomaRt)

probeSets <- c("219666_at", "220547_s_at", "218034_at")

# Connect to Ensembl and select the human gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)

# Map each probe set ID to its HGNC symbol
idens <- getBM(attributes = c("affy_hg_u133a", "hgnc_symbol"),
               filters = "affy_hg_u133a", values = probeSets, mart = ensembl)
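One possibly more systematic alternative to trial and error is to search the attribute table rather than guess names. biomaRt's listAttributes() returns a data frame with "name" and "description" columns, so it can be filtered with a regular expression. The sketch below is only illustrative: find_attributes is a hypothetical helper, and toy_attrs is a made-up stand-in for the real table so the example runs without a network connection.

```r
# Hypothetical helper: search an attribute table (as returned by
# biomaRt::listAttributes) for a pattern in either the name or description.
find_attributes <- function(attr_table, pattern) {
  hits <- grepl(pattern, attr_table$name, ignore.case = TRUE) |
          grepl(pattern, attr_table$description, ignore.case = TRUE)
  attr_table[hits, , drop = FALSE]
}

# Made-up stand-in for listAttributes(ensembl); the rows are illustrative only.
toy_attrs <- data.frame(
  name        = c("hgnc_symbol", "description", "affy_hg_u133a"),
  description = c("HGNC symbol", "Gene description", "AFFY HG U133A probe"),
  stringsAsFactors = FALSE
)

hits <- find_attributes(toy_attrs, "descr")
print(hits)

# With a live mart object this would be called as:
#   find_attributes(listAttributes(ensembl), "descr")
```

If the real table contains an attribute named "description", that looks like a plausible candidate for the Gene Title column, though this is an assumption to verify against the actual getBM() output.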

Also, does anyone have any suggestions on how to handle the
duplicates (seen in this example) with respect to GSEA?
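For concreteness, here is a minimal sketch of one possible duplicate-handling policy: drop rows without a symbol, then collapse multiple annotations for the same probe set into a single row. Whether this is the policy GSEA expects is not something established here. The function collapse_probes, the " /// " separator, and the toy values are all assumptions made for illustration.

```r
# Toy annotation table standing in for a getBM() result (values are made up).
annot <- data.frame(
  probe  = c("p1_at", "p1_at", "p2_at", "p3_at"),
  symbol = c("GENE_A", "GENE_B", "GENE_C", ""),
  title  = c("title A", "title B", "title C", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per probe set, joining distinct values with `sep`.
collapse_probes <- function(annot, sep = " /// ") {
  # Drop rows with a missing or empty symbol
  keep  <- !is.na(annot$symbol) & annot$symbol != ""
  annot <- annot[keep, , drop = FALSE]
  join  <- function(x) paste(unique(x), collapse = sep)
  chip  <- aggregate(annot[, c("symbol", "title")],
                     by = list(probe = annot$probe), FUN = join)
  names(chip) <- c("Probe Set ID", "Gene Symbol", "Gene Title")
  chip
}

chip <- collapse_probes(annot)

# Write a tab-delimited file with the three chip-file columns
out_file <- tempfile(fileext = ".chip")
write.table(chip, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

An alternative policy would be to keep only the first annotation per probe set, or to discard ambiguous probe sets entirely; the choice affects how probes are mapped to genes and is worth checking against the GSEA documentation.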


Juliet Hannah
