[BioC] gene expression data followup

James W. MacDonald jmacdon at med.umich.edu
Fri Nov 14 14:55:06 CET 2008

An example using the sample.ExpressionSet dataset.

> library("survival")
> library("Biobase")
> data(sample.ExpressionSet)

## fake some survival times - there are 26 observations
## let's say survival time is <= 36 months for all patients,
## with some amount of censoring (status 0 = censored, 1 = event)
> surv.time <- Surv(sample(1:36, 26, replace=TRUE), sample(0:1, 26, replace=TRUE))
> surv.time
  6+ 35  17+ 18  11  35  15  15+  7+ 14+ 31  12+ 15+  1+ 14+ 24  30  19+  8+
 25+ 22   4+ 21+  3  23  18+

## fit model with first gene

> mod <- coxph(surv.time~exprs(sample.ExpressionSet)[1,])
> summary(mod)
Call:
coxph(formula = surv.time ~ exprs(sample.ExpressionSet)[1, ])

n= 26
                                    coef exp(coef) se(coef)     z   p
exprs(sample.ExpressionSet)[1, ] 0.00656      1.01  0.00782 0.839 0.4

                                 exp(coef) exp(-coef) lower .95 upper .95
exprs(sample.ExpressionSet)[1, ]      1.01      0.993     0.991      1.02

Rsquare= 0.026   (max possible= 0.793 )
Likelihood ratio test= 0.68  on 1 df,   p=0.411
Wald test            = 0.7  on 1 df,   p=0.402
Score (logrank) test = 0.72  on 1 df,   p=0.396

##OK, so not significant - let's plot anyway

> plot(survfit(mod))

You can just wrap this up in a call to apply to do all genes. In
addition, you could pull out the LR test statistic/p-value as a first
pass to see which genes are significant, and then go back and just plot
those genes.
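
As a rough sketch of that idea (not tested beyond this toy data): the helper name lr.test, the objects res and sig, the seed, and the 0.05 cutoff are all just made up for illustration. The LR statistic is computed from the two log-likelihoods that coxph stores in the fit (null model and fitted model), and the p-values are unadjusted, so treat them as a first-pass screen only.

```r
library("survival")
library("Biobase")
data(sample.ExpressionSet)

## same fake survival data as above; the seed is arbitrary, only for reproducibility
set.seed(1)
surv.time <- Surv(sample(1:36, 26, replace = TRUE),
                  sample(0:1, 26, replace = TRUE))

## hypothetical helper: fit a one-gene Cox model, return LR statistic and p-value
lr.test <- function(x, surv) {
  tryCatch({
    fit <- coxph(surv ~ x)
    ## fit$loglik holds the null and the fitted log partial likelihood
    lr <- 2 * diff(fit$loglik)
    c(LR = lr, p = pchisq(lr, df = 1, lower.tail = FALSE))
  }, error = function(e) c(LR = NA_real_, p = NA_real_))
}

## one row of the expression matrix per gene; result is a genes x 2 matrix
res <- t(apply(exprs(sample.ExpressionSet), 1, lr.test, surv = surv.time))

## genes passing an (arbitrary) unadjusted cutoff
sig <- rownames(res)[which(res[, "p"] < 0.05)]

## go back and plot only those genes, as was done for the first gene
for (g in sig) {
  x <- exprs(sample.ExpressionSet)[g, ]
  plot(survfit(coxph(surv.time ~ x)), main = g)
}
```

With real data you would replace surv.time with your observed times and censoring indicators, and you would probably want to run the p-values through p.adjust before deciding what to plot.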

Best,

Jim

> Dear group,
> How can expression data for a group of genes be correlated with
> survival covariate data using a Cox model, and then plotted as a
> Kaplan-Meier curve? Say I have a subset of data from an MxN matrix
> (M genes and N samples). I take the expression values for YxN (Y is a
> subset of the M genes, and the N samples are a collection of cancer
> and normal), use recurrence time or survival time, and check whether
> the Y genes are significant under a Cox model for recurrence. If they
> are significant, I plot them using a Kaplan-Meier curve. I want to be
> able to use the coxph and survh functions. I do not know how to use
> both the expression data and the survival covariate data to see
> whether a set of genes is significant.
> Thanks

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-5646
734-936-8662