[BioC] Cox Model

A Gusnanto arief at maths.leeds.ac.uk
Wed Feb 13 10:40:32 CET 2008


Hi Eleni,

The analysis you have in mind has been described in Pawitan et al.
Statistics in Medicine 2004; 23:1767–1780 (DOI: 10.1002/sim.1769).

Prof. Pawitan gave me the R code for the analysis a long time ago, but I
can't find it on my computer at present. I will look for it in my
old archives, and if I find it, I'll let you know.

If you are interested in identifying significant genes using this type of
analysis, be aware that no gene will turn up significant (in the sense of
having a gene-wise 95% CI that excludes zero). This is due to the limited
number of samples available for estimating 18,000 parameters (a sparse
variance-covariance matrix is involved). Further details are described in
the paper.
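
To make the dimensionality problem concrete, here is a rough sketch in R.
It is only an illustration under stated assumptions: the matrix is
simulated, n_genes_demo is a deliberately small stand-in for the 18,000
genes, and all variable names are hypothetical. It shows that a design
matrix with 80 rows cannot have rank above 80, and that a dense
18,000 x 18,000 variance-covariance matrix of doubles would by itself need
about 2.4 GiB, far more than the limits mentioned in the original question.

```r
set.seed(42)
n_samples    <- 80
n_genes_demo <- 500    # small stand-in; the thread has about 18,000 genes

# Simulated expression matrix (placeholder, not real data)
genes_demo <- matrix(rnorm(n_samples * n_genes_demo), nrow = n_samples)

# The rank is bounded by the number of samples, so at most 80 independent
# coefficients can be estimated, however many gene columns there are.
design_rank <- qr(genes_demo)$rank
cat("columns:", n_genes_demo, " rank:", design_rank, "\n")

# Size of a dense p x p matrix of doubles (8 bytes each) for p = 18,000
p <- 18000
vcov_gib <- p^2 * 8 / 1024^3
cat("dense variance-covariance matrix: about",
    round(vcov_gib, 1), "GiB\n")   # about 2.4 GiB
```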

Although I have never tried this, my suggestion would be to perform a
separate survival analysis for each gene to obtain gene-wise p-values, and
then control the false discovery rate (FDR).
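
The gene-wise approach suggested above could be sketched in R roughly as
follows. This is a minimal illustration, not tested code from this thread:
it assumes the survival package, runs on simulated data, and uses
hypothetical names (gene_pvalue, n_genes, and the simulated time, relapse
and genes objects). Using the Wald p-value from each univariate Cox model
and the Benjamini-Hochberg adjustment are assumptions on my part; other
tests or FDR procedures would fit the same pattern.

```r
library(survival)

set.seed(1)
n_samples <- 80
n_genes   <- 200   # small stand-in for the ~18,000 genes in the question

# Simulated placeholders, not real data
time    <- rexp(n_samples, rate = 0.1)               # follow-up times
relapse <- rbinom(n_samples, size = 1, prob = 0.6)   # 1 = relapsed, 0 = censored
genes   <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
                  dimnames = list(NULL, paste0("gene", seq_len(n_genes))))

# Fit one univariate Cox model for a single gene and return its Wald p-value
gene_pvalue <- function(x, time, status) {
  fit <- coxph(Surv(time, status) ~ x)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
}

# One model per column (gene), so only one coefficient is estimated at a time
p_values <- apply(genes, 2, gene_pvalue, time = time, status = relapse)

# Benjamini-Hochberg adjustment to control the false discovery rate
q_values <- p.adjust(p_values, method = "BH")

# Genes ordered by adjusted p-value
head(sort(q_values))
```

Because each fit involves a single covariate, memory use stays small and
the loop scales linearly with the number of genes. On the real data, genes
would be the 80 x 18000 expression matrix with samples in rows.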

Regards,

Arief

--
Dr. Arief Gusnanto
Dept. of Statistics
University of Leeds
Leeds LS2 9JT
United Kingdom
Phone +44 113 3435135
Fax   +44 113 3435090
Email arief at maths.leeds.ac.uk


On Wed, 2008-02-13 at 09:10 +0200, Eleni Christodoulou wrote:
> Hello BioC-community,
> 
> For a week now I have been struggling with the implementation of a Cox
> model in R. I have 80 cancer patients, so 80 time measurements and 80
> relapse indicators (the censoring status: 1 if the patient relapsed during
> the examined period, 0 if not). My microarray data contain around 18000 genes.
> So I have the expressions of 18000 genes in each of the 80 tumors (matrix
> 80*18000). I would like to build a Cox model in order to retrieve the most
> significant genes (according to the p-value). The command that I am using
> is:
> 
> test1 <- list(time,relapse,genes)
> coxph( Surv(time, relapse) ~ genes, test1)
> 
> where time is a vector of length 80 containing the times, relapse is a vector
> of length 80 containing the relapse values, and genes is an 80*18000 matrix.
> When I run the coxph command I get an error saying that R cannot
> allocate a vector of size 2.7Mb (in Windows). I also tried Linux, where I
> get an error that the maximum memory has been reached. I increased the
> memory by starting R with the command:
> R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M
> 
> I think it cannot get better than that, because if I try, for example,
> max-vsize=300, the memory capacity is stored as NA.
> 
> Does anyone have any idea why this happens and how I can overcome it?
> 
> I would be really grateful if you could help!
> It has been bothering me a lot!
> 
> Thank you all,
> Eleni