[BioC] creating GO and KEGG gene set collections for E. coli

Clémentine Dressaire clementinedressaire at itqb.unl.pt
Fri Nov 19 13:14:00 CET 2010


Dear BioC users,

I would like to perform gene set analysis with gage using the genes I
previously selected as differentially expressed in my condition of
interest. My first problem is that I am working with E. coli, for which I
could not find any files equivalent to kegg.gs or go.gs.

I therefore tried to create my own gene set collection using GSEABase and
the commands below, where namesallexpr is the vector of my 10208 probe
identifiers. I tried this on several computers, but I always get an error
message (reported below) about a memory problem. On my own laptop I also
get an additional warning about a lack of virtual memory.

Does anyone have tricks or suggestions to get around this memory problem?
Or is it a programming error? Has anyone already created such a gene set
collection (I am interested in both GO and KEGG collections) for the
Affymetrix ecoli2 chip, and would be willing to share it?

Many thanks for your help,

Clémentine





>         KEGGids=unique(mget(namesallexpr,ecoli2ENZYME,ifnotfound=NA))
>         KEGGids=KEGGids[-1]
>         KEGGids=as.character(KEGGids[-1])
>
>     lst=as.list(ecoli2PATH)
>     gscKEGG = GeneSetCollection(mapply(function(geneIds,KEGGids) {
+     GeneSet(geneIds=namesallexpr, geneIdType=EntrezIdentifier(),
+             collectionType=KEGGCollection(KEGGids),
+             setName=KEGGids)}, lst, names(lst)))
Error: cannot allocate vector of size 40 Kb
In addition: Warning messages:
1: In names(object) <- NULL :
  Reached total allocation of 959Mb: see help(memory.size)
2: In initialize(value, ...) :
  Reached total allocation of 959Mb: see help(memory.size)
3: In initialize(value, ...) :
  Reached total allocation of 959Mb: see help(memory.size)
4: In initialize(value, ...) :
  Reached total allocation of 959Mb: see help(memory.size)
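One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: as.list(ecoli2PATH) has one entry per probe, and the function passed to mapply ignores its geneIds argument and puts the whole namesallexpr vector into every GeneSet. The call therefore seems to build roughly one set per probe, each holding all 10208 identifiers, which could explain the memory exhaustion. The base R sketch below uses a made-up toy mapping (the probe names and pathway IDs are hypothetical, not taken from ecoli2.db) to show how a probe-to-pathway list might be inverted into a plain named list of pathway-to-probes, which is the shape of gage's kegg.gs.

```r
# Toy stand-in for as.list(ecoli2PATH): one entry per probe, each holding
# the KEGG pathway IDs annotated to that probe (NA when there are none).
# All probe names and pathway IDs here are invented for illustration.
probe2path <- list(
  probe_a = c("00010", "00020"),
  probe_b = "00010",
  probe_c = NA_character_,
  probe_d = "00020"
)

# Drop probes that have no pathway annotation at all.
has_path <- !vapply(probe2path, function(p) all(is.na(p)), logical(1))
annotated <- probe2path[has_path]

# Invert the mapping: repeat each probe name once per pathway it belongs to,
# then group the probe names by pathway ID.
probes <- rep(names(annotated), lengths(annotated))
paths <- unlist(annotated, use.names = FALSE)
path2probe <- split(probes, paths)

# Each set holds only its own members, so the total size is the number of
# probe-pathway pairs, not (number of sets) x (number of probes).
total_ids <- sum(lengths(path2probe))
```

A named list like path2probe could then be tried as the gsets argument of gage(), assuming the row names of the expression matrix use the same probe identifiers as the sets. With the real annotation package, the reverse map (ecoli2PATH2PROBE, if present in the installed version) may give the pathway-to-probes list directly.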





> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C                   LC_TIME=French_France.1252

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GSEABase_1.12.0       ecoli2cdf_2.6.0       XML_3.2-0.1           gage_2.0.0            multtest_2.6.0
 [6] ecoli2.db_2.4.7       org.EcK12.eg.db_2.4.6 simpleaffy_2.26.0     gcrma_2.22.0          genefilter_1.32.0
[11] annotate_1.28.0       biomaRt_2.6.0         GOstats_2.16.0        graph_1.28.0          Category_2.16.0
[16] annaffy_1.22.0        KEGG.db_2.4.5         GO.db_2.4.5           RSQLite_0.9-2         DBI_0.2-5
[21] AnnotationDbi_1.12.0  lattice_0.19-13       affy_1.28.0           Biobase_2.10.0        limma_3.6.0

loaded via a namespace (and not attached):
 [1] affyio_1.18.0         Biostrings_2.18.0     IRanges_1.8.0         MASS_7.3-8            preprocessCore_1.12.0
 [6] RBGL_1.25.1           RCurl_1.4-4.1         splines_2.12.0        survival_2.35-8       tools_2.12.0
[11] xtable_1.5-6







-- 
Clémentine Dressaire
Post-doctoral research fellow
Control of gene expression lab
ITQB - Instituto de Tecnologia Química e Biológica
Apartado 127, Av. da República
2780-157 Oeiras
Portugal
+351 214469562


