[BioC] Adding to GeneSetCollection object from function

Iain Gallagher iaingallagher at btopenworld.com
Sat Mar 12 14:44:41 CET 2011


Hello Martin.

I have been thinking / experimenting as well and came up with the following:

#toy code
library(GSEABase)

testList <- list('hsa-mir-451'=c('SATB2', 'MECP2', 'CTNNBIP1', 'SATB2'), 'hsa-mir-452'=c('SATB2', 'MEIS2', 'PRDM16', 'PRDM16'), 'hsa-mir-453'=c('SATB2', 'SNAI1', 'MECP2'))

n <- names(testList)
uniqueList <- lapply(testList, unique)  # need unique values in list elements to make gene sets

#make a function to create the sets
makeSet <- function(geneIds, n) {
         GeneSet(geneIds, geneIdType=SymbolIdentifier(), setName=n)
}

#apply the function to each element in the list and make a list of genesets
gsList <- mapply(makeSet, uniqueList, n, SIMPLIFY=FALSE)

#make the geneset collection
gsc <- GeneSetCollection(gsList)

This is based on the code in the GeneSetCollection help page for a KEGG-based gene set.

I hadn't used mapply before and wrapping my head around what it was doing took some time.
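
For anyone else meeting mapply for the first time, here is a minimal sketch of what it does, using base R only. describeSet is a hypothetical stand-in for GeneSet (so the example runs without GSEABase), and the list contents are purely illustrative.

```r
# Toy stand-in for GeneSet(): base R only, so it runs without GSEABase.
# describeSet is a hypothetical helper, not part of any package.
describeSet <- function(geneIds, setName) {
    paste0(setName, ": ", paste(geneIds, collapse = ", "))
}

ids <- list(a = c("SATB2", "MECP2"), b = c("MEIS2", "PRDM16"))

# mapply() walks its arguments in parallel: call 1 gets ids[[1]] and
# names(ids)[1], call 2 gets ids[[2]] and names(ids)[2], and so on.
# SIMPLIFY = FALSE keeps the result as a list, one element per call.
viaMapply <- mapply(describeSet, ids, names(ids), SIMPLIFY = FALSE)

# Map() is the same idea with SIMPLIFY = FALSE built in.
viaMap <- Map(describeSet, ids, names(ids))

print(viaMapply)
```

Both calls return a list named after the first argument, which is the shape GeneSetCollection expects when the function being applied is GeneSet.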

Your code is shorter and neater. 

Thank you.

Iain

--- On Sat, 12/3/11, Martin Morgan <mtmorgan at fhcrc.org> wrote:

> From: Martin Morgan <mtmorgan at fhcrc.org>
> Subject: Re: [BioC] Adding to GeneSetCollection object from function
> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> Cc: "bioconductor" <bioconductor at stat.math.ethz.ch>
> Date: Saturday, 12 March, 2011, 12:40
> On 03/12/2011 02:14 AM, Iain Gallagher wrote:
> > oops, just correcting a typo in the code!
> > 
> > library(GSEABase)
> > 
> > testList <- list('hsa-mir-451'=c('SATB2', 'MECP2', 'CTNNBIP1'),
> >                  'hsa-mir-452'=c('SATB2', 'MEIS2', 'PRDM16'),
> >                  'hsa-mir-453'=c('SATB2', 'SNAI1', 'MECP2'))
> > 
> > geneSetFunc <- function(listIn)
> > {
> >     l <- length(listIn)
> >     setNames <- names(listIn)
> > 
> >     for (i in 1:l) {
> >         gsTest <- GeneSet(unique(listIn[[i]]),
> >                           geneIdType=SymbolIdentifier(),
> >                           setName=setNames[i])
> >     }
> > 
> >     return(gsTest)
> > }
> 
> Hi Iain --
> 
> I created a list of GeneSets from your named list using Map (which is
> like mapply). Then I made the list of sets into a GeneSetCollection.
> 
> geneSetFunc <- function(listIn)
> {
>     sets <- Map(GeneSet, listIn, setName=names(listIn),
>                 MoreArgs=list(geneIdType=SymbolIdentifier()))
>     GeneSetCollection(sets)
> }
> 
> Hope that helps,
> 
> Martin
> 
> > 
> > test <- geneSetFunc(testList)
> > 
> > thanks
> > 
> > i
> > 
> > --- On Fri, 11/3/11, Iain Gallagher <iaingallagher at btopenworld.com> wrote:
> > 
> >> From: Iain Gallagher <iaingallagher at btopenworld.com>
> >> Subject: [BioC] Adding to GeneSetCollection object from function
> >> To: "bioconductor" <bioconductor at stat.math.ethz.ch>
> >> Date: Friday, 11 March, 2011, 21:36
> >> Hello list,
> >>
> >> I have written a small function to create GeneSets from a
> >> list object. As each GeneSet is generated I would like to
> >> add them to a GeneSetCollection object.
> >>
> >> How do I do this?
> >>
> >> #toy code
> >> library(GSEABase)
> >>
> >> testList <- list('hsa-mir-451'=c('SATB2', 'MECP2', 'CTNNBIP1'),
> >>                  'hsa-mir-452'=c('SATB2', 'MEIS2', 'PRDM16'),
> >>                  'hsa-mir-453'=c('SATB2', 'SNAI1', 'MECP2'))
> >>
> >> geneSetFunc <- function(listIn)
> >> {
> >>     l <- length(listIn)
> >>     setNames <- names(listIn)
> >>     for (i in 1:l) {
> >>         gsTest <- GeneSet(unique(listIn[[i]]),
> >>                           geneIdType=SymbolIdentifier(),
> >>                           setName=names[i])
> >>     }
> >>
> >>     return(gsTest)
> >> }
> >>
> >> test <- geneSetFunc(testList)
> >>
> >> test2 <- GeneSetCollection(test)
> >>
> >> As it stands only the last GeneSet out makes the collection
> >> & as each output arrives in the collection it overwrites
> >> the previous.
> >>
> >> Thanks for any help / pointers.
> >>
> >> best
> >>
> >> i
> >>
> >>> sessionInfo()
> >> R version 2.12.2 (2011-02-25)
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >>
> >> locale:
> >>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
> >>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
> >>  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
> >>  [7] LC_PAPER=en_GB.utf8       LC_NAME=C
> >>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>
> >> other attached packages:
> >> [1] GSEABase_1.12.1      graph_1.28.0         annotate_1.28.0
> >> [4] AnnotationDbi_1.12.0 Biobase_2.10.0
> >>
> >> loaded via a namespace (and not attached):
> >> [1] DBI_0.2-5     RSQLite_0.9-4 tools_2.12.2  XML_3.2-0     xtable_1.5-6
> >>>
> >>
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> > 
> 
> 
> -- 
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> 
> Location: M1-B861
> Telephone: 206 667-2793
>