[BioC] Re : Cox Model

J.J.Goeman at lumc.nl
Thu Feb 14 12:10:14 CET 2008


 Dear Eleni,

If you are interested in predicting survival with (a subset of) your 18000 genes, you may want to have a look at the "penalized" package on CRAN (http://cran.us.r-project.org/src/contrib/Descriptions/penalized.html) or at other CRAN packages that perform penalized estimation.
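As a minimal sketch of what a lasso-penalized Cox fit with "penalized" might look like (assuming the `time`, `relapse`, and `genes` objects from the message below; the value lambda1 = 10 is purely illustrative, and optL1() can choose the penalty by cross-validation):

```r
library(penalized)  # penalized estimation for (among others) Cox models
library(survival)

## Assumed inputs: time (length-80 vector of follow-up times),
## relapse (length-80 0/1 event indicator),
## genes (80 x 18000 expression matrix, samples in rows).

## Fit a Cox model with an L1 (lasso) penalty; lambda1 = 10 is only
## an illustrative value, not a recommendation.
fit <- penalized(Surv(time, relapse), penalized = genes, lambda1 = 10)

## Genes with non-zero coefficients form the selected subset.
cf <- coefficients(fit, "penalized")
cf[cf != 0]

## Cross-validated choice of the penalty (can be slow with 18000 genes):
## opt <- optL1(Surv(time, relapse), penalized = genes, fold = 5)
```

The lasso penalty drives most coefficients exactly to zero, which is what makes a model with far more covariates than samples estimable at all.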

Jelle
 

> -----Original Message-----
> From: Eleni Christodoulou [mailto:elenichri at gmail.com] 
> Sent: 13 February 2008 13:22
> To: phguardiol at aol.com
> Cc: rdiaz at cnio.es; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Re : Cox Model
> 
> Hi,
> 
> Thanks for the replies. I will probably perform survival analysis on 
> each gene to get gene-wise p-values, select the most significant ones 
> (those below a certain p-value threshold), and then run a full Cox 
> regression using the selected genes. Do you think that this makes 
> sense?
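> As a rough sketch of that two-stage screening idea (assuming, as in my 
> original message, vectors time and relapse of length 80 and an 
> 80*18000 matrix genes; the 0.05 cutoff is arbitrary):
> 
> ```r
> library(survival)
> 
> ## Univariate Cox p-value for each gene.
> pvals <- apply(genes, 2, function(g) {
>   fit <- coxph(Surv(time, relapse) ~ g)
>   summary(fit)$coefficients[1, "Pr(>|z|)"]
> })
> 
> ## With 18000 tests, adjust for multiple testing before selecting.
> keep <- which(p.adjust(pvals, method = "BH") < 0.05)
> 
> ## Multivariate Cox model on the selected genes only.
> coxph(Surv(time, relapse) ~ ., data = data.frame(genes[, keep]))
> ```
> 
> One caveat: selecting genes and then refitting on the same 80 samples 
> makes the second-stage p-values optimistic.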
> 
> Thanks a lot,
> Eleni
> 
> On Feb 13, 2008 2:11 PM, <phguardiol at aol.com> wrote:
> 
> >  Hi,
> > Wouldn't it make sense to first perform dimensionality reduction 
> > before undertaking such a survival analysis? Certainly, some of your 
> > genes have similar expression profiles across samples...?
> >  Best,
> >  Philippe Guardiola
> >
> >
> >  -----Original Message-----
> > From: Ramon Diaz-Uriarte <rdiaz at cnio.es>
> > To: bioconductor at stat.math.ethz.ch
> > Cc: Eleni Christodoulou <elenichri at gmail.com>
> > Sent: Wed, 13 February 2008 11:23
> > Subject: Re: [BioC] Cox Model
> >
> >  Dear Eleni,
> >
> >
> > You are trying to fit a model with 18000 covariates but only 80 
> > samples (of which, at most, only 80 are not censored). Just doing it 
> > the way you are trying to do it is unlikely to work or make much 
> > sense...
> >
> >
> > You might want to take a look at the work of Torsten Hothorn and 
> > colleagues on survival ensembles, with implementations in the R 
> > package mboost, and their work on random forests for survival data 
> > (see the R package party). Some of this functionality is also 
> > accessible through our web-based tool SignS
> > (http://signs.bioinfo.cnio.es), which uses the above packages.
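> >
> > A hedged sketch of the random-forest route with party (assuming a 
> > data frame d holding time, relapse, and one column per gene; the 
> > ntree and mtry values are illustrative only):
> >
> > ```r
> > library(party)
> > library(survival)
> >
> > ## d: data frame with columns time, relapse, and the gene columns.
> > ## cforest handles a censored (Surv) response directly.
> > rf <- cforest(Surv(time, relapse) ~ ., data = d,
> >               controls = cforest_unbiased(ntree = 500, mtry = 100))
> >
> > ## Out-of-bag predictions (Kaplan-Meier-type survival estimates
> > ## per case for a censored response).
> > pred <- treeresponse(rf, OOB = TRUE)
> > ```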
> >
> >
> > Depending on your exact question, you might also want to look at the 
> > approach of Jelle Goeman for testing whether sets of genes (e.g., 
> > your complete set of 18000 genes) are related to the outcome of 
> > interest (survival in your case). Goeman's approach is available in 
> > the globaltest package from BioC.
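> >
> > A sketch of that global test (assuming the same time, relapse, and 
> > genes objects, with samples in the rows of genes; note that older 
> > versions of the package expose this as globaltest() rather than gt()):
> >
> > ```r
> > library(globaltest)
> > library(survival)
> >
> > ## Test whether the full set of 18000 genes, taken together,
> > ## is associated with survival.
> > gt(Surv(time, relapse), alternative = genes)
> > ```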
> >
> >
> > Hope this helps,
> >
> > R.
> >
> > On Wednesday 13 February 2008 08:10, Eleni Christodoulou wrote:
> >
> > > Hello BioC-community,
> > >
> > > I have been struggling for a week now with the implementation of a 
> > > Cox model in R. I have 80 cancer patients, so 80 time measurements 
> > > and 80 relapse/no-relapse indicators (the censoring variable: 1 if 
> > > the patient relapsed during the examined period, 0 if not). My 
> > > microarray data contain around 18000 genes, so I have the 
> > > expression of 18000 genes in each of the 80 tumors (an 80*18000 
> > > matrix). I would like to build a Cox model in order to retrieve 
> > > the most significant genes (according to the p-value). The 
> > > commands that I am using are:
> > >
> > > test1 <- list(time, relapse, genes)
> > > coxph(Surv(time, relapse) ~ genes, test1)
> > >
> > > where time is a vector of length 80 containing the times, relapse 
> > > is a vector of length 80 containing the relapse values, and genes 
> > > is an 80*18000 matrix. When I give the coxph command I get an 
> > > error saying that it cannot allocate a vector of size 2.7Mb (on 
> > > Windows). I also tried Linux, and there I get an error that the 
> > > maximum memory has been reached. I increased the memory by 
> > > initializing R with the command:
> > >
> > > R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M
> > >
> > > I think it cannot get better than that, because if I try, for 
> > > example, max-vsize=300, the memory capacity is stored as NA.
> > >
> > > Does anyone have any idea why this happens and how I can overcome 
> > > it?
> > >
> > > I would be really grateful if you could help!
> > > It has been bothering me a lot!
> > >
> > > Thank you all,
> > > Eleni
> >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> >
> > --
> > Ramón Díaz-Uriarte
> > Statistical Computing Team
> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > (Spanish National Cancer Center)
> > Melchor Fernández Almagro, 3
> > 28029 Madrid (Spain)
> > Fax: +34-91-224-6972
> > Phone: +34-91-224-6900
> > http://ligarto.org/rdiaz
> > PGP KeyID: 0xE89B3462
> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> >
> >
> >
> 


