[BioC] making gene sets for geneSetTest of Limma

Martin Morgan mtmorgan at fhcrc.org
Wed May 28 05:20:11 CEST 2008


Hi Srini --

Srinivas Iyyer wrote:
> Dear group,
>
> i want to convert all gene sets available from GSEA
> (C1,C2,C3 and C4) into a 4 different gene sets, so
> that I can use geneSetTest of limma on above 4
> different gene sets. 
>   
I'm not really sure what you want to do, or whether what you want 
to do makes sense. Here's my best guess at 'getting the job done', but 
maybe others will give some more advice on whether it's actually a good 
idea.

One possibility is to visit the Broad and download their entire database

http://www.broad.mit.edu/gsea/downloads.jsp

(look for 'XML database file'). You can then read the file into R with

 > library(GSEABase)
 > gss = getBroadSets('/path/to/msigdb_v2.5.xml')

You might then run

 > collType = lapply(gss, collectionType)
 > catType = sapply(collType, bcCategory)
 > table(catType)
catType
  c1   c2   c3   c4   c5
 386 1892  837  883 1454

to get the category of each gene set, and

 > c1sets = gss[catType=="c1"]

to select just the c1 sets.
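
If the goal is one collection per category (c1 to c4), the same idea can be applied to all categories at once. Here is a minimal sketch in base R with toy data: `ids` is a made-up stand-in for the list of identifiers I would expect `geneIds(gss)` to return, and the set and gene names are hypothetical. With the real objects, `gss[i]` would take the place of `ids[i]`.

```r
# Toy stand-in for geneIds(gss): one character vector of gene
# identifiers per gene set (all names here are made up).
ids <- list(
  set1 = c("geneA", "geneB"),
  set2 = c("geneC", "geneD", "geneE"),
  set3 = c("geneA", "geneF"),
  set4 = c("geneB", "geneG"),
  set5 = c("geneH", "geneI"),
  set6 = c("geneJ")
)

# Category of each set, as computed by sapply(collType, bcCategory) above.
catType <- c("c1", "c1", "c2", "c3", "c4", "c5")

# Group the set positions by category and keep only the four wanted ones.
wanted <- c("c1", "c2", "c3", "c4")
idxByCat <- split(seq_along(ids), catType)[wanted]

# One list of gene sets per category.
setsByCat <- lapply(idxByCat, function(i) ids[i])

# Number of gene sets in each category.
sapply(setsByCat, length)
```

Each element of `setsByCat` is then a named list of character vectors, one vector per gene set.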
> In one of the examples (classic estrogen example, only
> one set is described).  
>   
I'm not really sure what you are referring to (where is the example?) 
and I'm not sure that geneSetTest will do what you want.

Say you've performed lmFit. You could make a vector of the relevant test 
statistic, and make sure the vector elements have appropriate names, 
e.g., converting probes to Symbol identifiers. Say that vector is x.
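
A rough sketch of building such a vector, using toy values in base R. The statistics and the probe-to-symbol map below are made up; in practice the statistics might come from something like the t column of an lmFit / eBayes fit, and the map from an annotation package. Keeping the probe with the largest absolute statistic for each symbol is just one possible way to handle duplicates, not a recommendation.

```r
# Toy test statistics named by probe identifier (hypothetical values).
tstat <- c("1001_at" = 2.5, "1002_at" = -0.3, "1003_at" = 1.1, "1004_at" = -3.2)

# Hypothetical probe-to-symbol map; NA marks a probe with no symbol.
probe2symbol <- c("1001_at" = "IL2", "1002_at" = "IL21",
                  "1003_at" = "IL2", "1004_at" = NA)

# Look up the symbol for each probe and drop unmapped probes.
symbol <- unname(probe2symbol[names(tstat)])
keep <- !is.na(symbol)
x <- tstat[keep]
symbol <- symbol[keep]

# Where several probes map to one symbol, keep the probe with the
# largest absolute statistic (an assumption made for this sketch).
ord <- order(abs(x), decreasing = TRUE)
x <- x[ord]
symbol <- symbol[ord]
first <- !duplicated(symbol)
x <- x[first]
names(x) <- symbol[first]

x
```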

If you were interested in a particular gene set, say all genes in 
chromosome band 4q27, you could select that

 > mySet = gss[["chr4q27"]]

You might then do something like

 > geneSetTest(geneIds(mySet), x, ...)

(where ... might be additional arguments to geneSetTest). If you wanted to 
test many sets, you might

lapply(gss[catType=="c1"], function(aSet) {
    geneSetTest(geneIds(aSet), x, ...)
})

(where again ... are additional arguments to geneSetTest).

The key is to get x to have names that are the same as the geneIds in 
the gene sets. The mapIdentifiers function in GSEABase might help.
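
Before testing, it may be worth checking how many identifiers in each set actually occur in names(x). The sketch below uses simulated statistics and made-up sets. It converts each set to integer positions in x, which is one of the forms I understand geneSetTest to accept as its first argument, and skips sets with no overlap. The call to geneSetTest is only attempted if limma is installed.

```r
set.seed(1)

# Simulated statistics for 100 hypothetical genes.
x <- rnorm(100)
names(x) <- paste0("gene", 1:100)

# Made-up gene sets; some identifiers are deliberately absent from x.
sets <- list(
  setA = c("gene1", "gene2", "gene3", "notMeasured"),
  setB = c("gene50", "gene51"),
  setC = c("absent1", "absent2")
)

# How many identifiers of each set are found among names(x)?
overlap <- sapply(sets, function(ids) sum(ids %in% names(x)))

# Positions in x of the members of each set.
index <- lapply(sets, function(ids) which(names(x) %in% ids))

# Only sets with at least one measured gene can be tested.
testable <- index[sapply(index, length) > 0]

if (requireNamespace("limma", quietly = TRUE)) {
  # One p-value per testable set, using geneSetTest's default settings.
  pvals <- sapply(testable, function(i) limma::geneSetTest(i, x))
  print(pvals)
}
```

A set with little or no overlap usually means the identifier types differ (e.g., probe ids versus symbols), which is where mapIdentifiers comes in.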

Hope this helps,

Martin
> What kind of formats geneSetTest will read for gene
> sets.  ( e.g. every set writting in a single line
> format OR a list of lists for each gene set). 
>
> Could anyone suggest some steps to make gene sets for
> genesettest. 
>
> thanks
> Srini
>


