[BioC] Romer and symbols2indices query

Loren Engrav engrav at u.washington.edu
Thu May 6 02:50:43 CEST 2010


Thank you for the response

Well maybe I don't, and maybe I shouldn't.  My thought was that tomorrow or
the day after there will be a new version of the .gmt file, and it would
be useful to be able to rerun things quickly. But maybe that is faulty
logic; maybe the .gmt files do not change that often. I also thought the c2
set was curated. And then there was simple curiosity.  I am also running GSEA,
GSA, and GSEA in MEV, so it seemed best to keep the files similar.  However...

I have now run romer and romer2 at weeks 1, 2, 3, 12, and 20 as below but have
not yet perused the results. I have only ~2000 genes of interest, so romer
does not take very long and I can easily run it again with the files you mention.

But I am searching for the files you mention in help(romer) and the 3 pdfs
and have missed them.  Where may I find them? Or are they the files Matthew
mentioned below at <http://bioinf.wehi.edu.au/software/MSigDB/index.html>?

I also have the other two questions in the "Romer warning serious? and
nrot=9999?" thread.

Thank you again for the response


=====================
romerDesign <- model.matrix(~
0+factor(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9,10,10,10)))

colnames(romerDesign) <- c("DW1", "YW1", "DW2", "YW2", "DW3", "YW3",
"DW12","YW12", "DW20", "YW20")

romerWkxcontrast.matrix <- makeContrasts(YWx-DWx, levels=romerDesign)

romerLR <- read.delim (file="romer1842LR.txt", header= FALSE, sep = "\t")

romerLRmatrix<- as.matrix (romerLR)

romerSymbols <- GSA1842symbolsvector

Broad_c2.all.v2.5.symbols.gmt <- getGmt("c2.all.v2.5.symbols.gmt",
collectionType=BroadCollection(category="c2"),
geneIdType=SymbolIdentifier())

Broad_c2.all.v2.5.symbols.gmtList <- geneIds(Broad_c2.all.v2.5.symbols.gmt)

names(Broad_c2.all.v2.5.symbols.gmtList) <-
names(Broad_c2.all.v2.5.symbols.gmt)

Broad_c2.all.v2.5.symbols.gmtIndices =
symbols2indices(Broad_c2.all.v2.5.symbols.gmtList, romerSymbols)

romerResultWkx <-
romer(Broad_c2.all.v2.5.symbols.gmtIndices, romerLRmatrix, romerDesign,
contrast=romerWkxcontrast.matrix, array.weights=NULL, block=NULL, correlation,
floor=FALSE, nrot=1000)

romer2ResultWkx <-
romer2(Broad_c2.all.v2.5.symbols.gmtIndices, romerLRmatrix, romerDesign,
contrast=romerWkxcontrast.matrix, array.weights=NULL, block=NULL, correlation,
nrot=1000)
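
The "x" in YWx-DWx above is a placeholder for the week. A minimal sketch of how the five per-week contrasts could be built in one step, using only base R and assuming the same design columns as above (the names `group`, `weeks` and `weekContrasts` are introduced here for illustration only):

```r
# Design with one column per group (10 groups x 3 replicates), as above.
group <- factor(rep(1:10, each = 3))
romerDesign <- model.matrix(~ 0 + group)
colnames(romerDesign) <- c("DW1", "YW1", "DW2", "YW2", "DW3", "YW3",
                           "DW12", "YW12", "DW20", "YW20")

weeks <- c(1, 2, 3, 12, 20)

# One contrast column per week: +1 for the YW group, -1 for the DW group.
weekContrasts <- sapply(weeks, function(w) {
  v <- setNames(numeric(ncol(romerDesign)), colnames(romerDesign))
  v[paste0("YW", w)] <- 1
  v[paste0("DW", w)] <- -1
  v
})
colnames(weekContrasts) <- paste0("YW", weeks, "-DW", weeks)
```

Each column, for example weekContrasts[, "YW12-DW12"], could then be passed as the contrast argument in place of the placeholder contrast. As far as I understand, makeContrasts(contrasts=colnames(weekContrasts), levels=romerDesign) should give the same matrix.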

> From: Gordon K Smyth <smyth at wehi.EDU.AU>
> Date: Thu, 6 May 2010 09:26:50 +1000 (AUS Eastern Standard Time)
> To: Loren Engrav <engrav at u.washington.edu>
> Cc: Yifang Hu <hu at wehi.edu.au>, rbioc <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Romer and symbols2indices query
> 
> Dear Loren,
> 
> I don't understand why you would want to read in a gmt file from the Broad
> Institute rather than use the curated rdata files that we provide for use
> with romer.  The raw gmt files contain a mix of gene symbols of different
> species and a mix of official and non-official symbols.  So one can't
> expect to match the symbols you get from a raw gmt file to your own data
> with any reliability.  To construct the rdata files, we have carefully
> converted all gene aliases to official symbols and have mapped mouse to
> human and human to mouse orthologs.
> 
> This is the reason why we don't provide a read.gmt() function in limma, or
> a pre-made pipeline from the GSEABase read functions.  We don't want you
> to get unreliable results simply because the gene symbols haven't been
> curated.
> 
> Best wishes
> Gordon
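
To see the matching problem described above on a small scale, here is a rough base-R sketch. The symbols and set names are made-up toy data, not taken from any real .gmt file, and the lapply() line is only a stand-in for the kind of index lookup that symbols2indices performs, not limma's actual code:

```r
# Toy inputs (hypothetical): symbols for the array probes, and two gene sets
# as they might appear in a raw .gmt file.
arraySymbols <- c("TP53", "EGFR", "MYC", "COL1A1", "TGFB1")
rawSets <- list(
  SET_A = c("TP53", "p53", "EGFR"),    # "p53" is an alias, not the official symbol
  SET_B = c("Tgfb1", "COL1A1", "MYC")  # "Tgfb1" is mouse-style capitalisation
)

# Fraction of each set's symbols found verbatim among the array symbols.
matchRate <- sapply(rawSets, function(s) mean(s %in% arraySymbols))

# Stand-in for the index lookup: positions in arraySymbols hit by each set.
setIndices <- lapply(rawSets, function(s) which(arraySymbols %in% s))
```

In this toy case a third of each set is silently lost because the alias and the mouse-style symbol do not match the official human symbols, which is the sort of loss the curated rdata files are meant to avoid.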
> 
>> Date: Tue, 04 May 2010 07:43:46 -0700
>> From: Loren Engrav <engrav at u.washington.edu>
>> To: rbioc <bioconductor at stat.math.ethz.ch>
>> Subject: Re: [BioC] Romer and symbols2indices query
>> Message-ID: <C80580B2.27D83%engrav at u.washington.edu>
>> Content-Type: text/plain; charset="US-ASCII"
>> 
>> Thank you, got it
>> 
>> Downloading rdata objects saves reading them into an rdata object, cool
>> 
>> But for interest, in R/GSA there is
>> GSA.read.gmt(filename.gmt) to read in a .gmt file
>> 
>> Does limma or romer have an equivalent function?
>> 
>> 
>>> From: Matthew Ritchie <mritchie at wehi.EDU.AU>
>>> Date: Tue, 4 May 2010 14:44:23 +1000 (EST)
>>> To: Loren Engrav <engrav at u.washington.edu>
>>> Cc: rbioc <bioconductor at stat.math.ethz.ch>
>>> Subject: Re: [BioC] Romer and symbols2indices query
>>> 
>>> Dear Loren,
>>> 
>>> You can find rdata objects of the Broad's MSigDB gene sets at
>>> 
>>> http://bioinf.wehi.edu.au/software/MSigDB/index.html
>>> 
>>> You are right, the 'symbols' argument of the function symbols2indices()
>>> is the vector of gene symbols corresponding to the probes from your
>>> microarray data.
>>> 
>>> For example, to use the human C2 collection, download the rdata file, then
>>> run the following.
>>> 
>>> load("human_c2.rdata")
>>> c2 = symbols2indices(Hs.gmtl.c2, symbols)
>>> 
>>> (this assumes 'symbols' is a vector containing the gene symbols from your
>>> array data)
>>> 
>>> Best wishes,
>>> 
>>> Matt
>>> 
>>>> Have done GSEA and GSA for set enrichment and am setting out to try romer
>>>> and probably have a "simple" question
>>>> 
>>>> To get the Broad set into a list of indices there is
>>>> symbols2indices(gmtl.official, symbols) but
>>>> 
>>>> 1)how do I get the Broad set into gmtl.official? And
>>>> 2)is symbols a vector of MY probe sets of interest?
>>>> 
>>>> I checked gmane and found only one comment about romer
>>>> Also checked limma reference pdf
>>>> 
>>>> Thank you
> 