[BioC] making gene sets for geneSetTest of Limma

Srinivas Iyyer srini_iyyer_bio at yahoo.com
Wed May 28 05:50:30 CEST 2008


Dear Martin, 
Thank you for your email. I should have explained
my question plainly.


I want to relate my differentially expressed genes
(data from lmFit and eBayes) to 10 different gene
expression studies.
What I have from those 10 studies is a list of genes
that are differentially expressed in each. I've written them
as the following tab-delimited text file (.gmt file as per
GSEA):

Study1  NA  G1 G2 G3 G4...
Study2  NA  G10 G49 G432.....

I want to test these gene sets on my lmFit object
using geneSetTest().

In the example listed at
http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estrogen.html
the call is:

geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both")

In this particular example, the author tests the
t-statistics obtained from the lmFit object against
the 'knownERgenesOnChip' set, which has ~20
genes, all from that chip.


However, in my case I want to test my lmFit object on
10 different sets. How do I make a similar
'knownERgenesOnChip' gene set that will hold all 10 of my
gene lists as sets?

Simply put, I do not want to run geneSetTest 10 times, once
per gene set. I want to run it only once.
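
A minimal sketch of what "running once" could look like: keep the
gene lists in a named R list and loop over it with sapply, which
returns one p-value per study. The vector x and the gene names below
are made-up placeholders, not real data; x stands in for a named
vector of t-statistics from the fit. The call assumes that
geneSetTest accepts gene names as the index when the statistics
vector is named.

```r
library(limma)

set.seed(1)
# Hypothetical statistics vector: in practice this would be something
# like the moderated t-statistics from eBayes, named by gene identifier.
x <- rnorm(500)
names(x) <- paste0("G", seq_along(x))

# Hypothetical gene sets, one list element per study.
gene_sets <- list(Study1 = c("G1", "G2", "G3", "G4"),
                  Study2 = c("G10", "G49", "G432"))

# One geneSetTest call per set, collected into a named vector.
p_values <- sapply(gene_sets, function(ids) {
  ids <- intersect(ids, names(x))  # keep only genes present on the chip
  geneSetTest(ids, x, alternative = "mixed")
})
p_values
```

Since this gives one p-value per set, some adjustment for multiple
testing (for example p.adjust) would probably be sensible.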


In the case of GSEA we make a .gmt file with each line
as one set.   
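
For reference, a sketch of reading such a .gmt file into a named
list in base R. The helper name read_gmt and the file contents are
hypothetical; it assumes tab-delimited fields with the set name
first and a description second. (GSEABase also provides getGmt()
for reading this format.)

```r
# Hypothetical two-line .gmt content written to a temporary file.
gmt_lines <- c("Study1\tNA\tG1\tG2\tG3\tG4",
               "Study2\tNA\tG10\tG49\tG432")
gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Read a .gmt file into a named list: one character vector per set.
read_gmt <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  sets <- lapply(fields, function(f) f[-(1:2)])  # drop name and description
  names(sets) <- vapply(fields, function(f) f[1], character(1))
  sets
}

gene_sets <- read_gmt(gmt_file)
str(gene_sets)
```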

Apologies for not stating my question simply.

Thank you.
srini




--- Martin Morgan <mtmorgan at fhcrc.org> wrote:

> Hi Srini --
> 
> Srinivas Iyyer wrote:
> > Dear group,
> >
> > I want to convert all gene sets available from GSEA
> > (C1, C2, C3 and C4) into 4 different gene sets, so
> > that I can use geneSetTest of limma on the above 4
> > different gene sets.
> >   
> Hi Srini -- I'm not really sure what you want to do, or if what
> you want to do makes sense. Here's my best guess at 'getting the
> job done', but maybe others will give some more advice on whether
> it's actually a good idea.
> 
> One possibility is to visit the Broad and download
> their entire database
> 
> http://www.broad.mit.edu/gsea/downloads.jsp
> 
> (look for 'XML database file'). You can then read
> these into R with
> 
>  > library(GSEABase)
>  > gss = getBroadSets('/path/to/msigdb_v2.5.xml')
> 
> you might then
> 
>  > collType = lapply(gss, collectionType)
>  > catType = sapply(collType, bcCategory)
>  > table(catType)
> catType
>   c1   c2   c3   c4   c5
>  386 1892  837  883 1454
> 
> to get the category of each gene set, and
> 
>  > c1sets = gss[catType=="c1"]
> 
> to select just the c1 sets.
> > In one of the examples (the classic estrogen example),
> > only one set is described.
> >   
> I'm not really sure what you are referring to (where is the
> example?) and I'm not sure that geneSetTest will do what you want.
> 
> Say you've performed lmFit. You could make a vector of the relevant
> test statistic, and make sure the vector elements have appropriate
> names, e.g., converting probes to Symbol identifiers. Say that
> vector is x.
> 
> If you were interested in a particular gene set, say all genes in
> chromosome band 4q27, you could select that
> 
>  > mySet = gss[["chr4q27"]]
> 
> You might then do something like
> 
>  > geneSetTest(geneIds(mySet), x, ...)
> 
> (where ... might be additional arguments to geneSetTest). If you
> wanted to test many sets, you might
> 
> lapply(gss[catType=="c1"], function(aSet) {
>     geneSetTest(geneIds(aSet), x, ...)
> })
> 
> (where again ... are additional arguments to
> geneSetTest).
> 
> The key is to get x to have names that are the same as the geneIds
> in the gene sets. The mapIdentifiers function in GSEABase might
> help.
> 
> Hope this helps,
> 
> Martin
> > What kind of formats will geneSetTest read for gene
> > sets (e.g. every set written in a single-line
> > format OR a list of lists for each gene set)?
> >
> > Could anyone suggest some steps to make gene sets
> > for geneSetTest?
> >
> > thanks
> > Srini
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>


