[R] survexp with large dataframes

Mike Harwood harwood262 at gmail.com
Wed Sep 28 19:02:03 CEST 2011


Hello, and thank you in advance.

I would like to capture the expected survival from a coxph model for
subjects in an observational study with recurrent events, but the
survexp call fails with a memory allocation error.  I am using R
version 2.13.1 (2011-07-08) on Windows XP.

My objective is to overlay the fitted survival curve on the
Kaplan-Meier plot.  Below is the code with its output and
[unfortunately] the error.  Is there something wrong with my use of
cluster() in fitting the proportional hazards model, or is there some
particular syntax needed to pass it into survexp?

Mike

> dim(dev)
[1] 899876     25

> mod1 <- coxph(Surv(begin.cp, end.cp, event)
+     ~ age.sex
+     + plan_type
+     + uw_load
+     + cluster(mbr_key),
+     data = dev)
>
> summary(mod1)
Call:
coxph(formula = Surv(begin.cp, end.cp, event) ~ age.sex + plan_type +
    uw_load + cluster(mbr_key), data = dev)

  n= 899876, number of events= 753324

                         coef exp(coef)  se(coef) robust se       z Pr(>|z|)
age.sex19-34_MALE   -0.821944  0.439576  0.005529  0.023298 -35.280  < 2e-16 ***
age.sex35-49_FEMALE  0.058776  1.060537  0.004201  0.018477   3.181  0.00147 **
age.sex35-49_MALE   -0.515590  0.597148  0.004634  0.019986 -25.798  < 2e-16 ***
age.sex50-64_FEMALE  0.190940  1.210386  0.004350  0.020415   9.353  < 2e-16 ***
age.sex50-64_MALE   -0.127514  0.880281  0.004487  0.021431  -5.950 2.68e-09 ***
age.sexCHILD_CHILD  -0.327522  0.720707  0.004238  0.017066 -19.192  < 2e-16 ***
plan_typeLOW        -0.165735  0.847270  0.002443  0.011080 -14.958  < 2e-16 ***
uw_load1-50          0.215122  1.240014  0.006437  0.029189   7.370 1.71e-13 ***
uw_load101-250       0.551042  1.735060  0.003993  0.018779  29.344  < 2e-16 ***
uw_load251+          0.981660  2.668884  0.003172  0.017490  56.126  < 2e-16 ***
uw_load51-100        0.413464  1.512046  0.006216  0.027877  14.832  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                    exp(coef) exp(-coef) lower .95 upper .95
age.sex19-34_MALE      0.4396     2.2749    0.4200    0.4601
age.sex35-49_FEMALE    1.0605     0.9429    1.0228    1.0996
age.sex35-49_MALE      0.5971     1.6746    0.5742    0.6210
age.sex50-64_FEMALE    1.2104     0.8262    1.1629    1.2598
age.sex50-64_MALE      0.8803     1.1360    0.8441    0.9180
age.sexCHILD_CHILD     0.7207     1.3875    0.6970    0.7452
plan_typeLOW           0.8473     1.1803    0.8291    0.8659
uw_load1-50            1.2400     0.8064    1.1711    1.3130
uw_load101-250         1.7351     0.5763    1.6724    1.8001
uw_load251+            2.6689     0.3747    2.5789    2.7620
uw_load51-100          1.5120     0.6614    1.4316    1.5970

Concordance= 0.643  (se = 0 )
Rsquare= 0.205   (max possible= 1 )
Likelihood ratio test= 206724  on 11 df,   p=0
Wald test            = 9207  on 11 df,   p=0
Score (logrank) test = 246358  on 11 df,   p=0,   Robust = 4574  p=0

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).

> dev.fit <- survexp( ~ 1, ratetable=mod1, data=dev)
Error in survexp.cfit(cbind(as.numeric(X), R), Y, conditional, FALSE,  :
  cannot allocate memory block of size 15.2 Gb
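
For reference, here is a minimal sketch of the overlay I am after, run on
the small pbc data set that ships with the survival package rather than on
my own data.  The covariates (trt, bili, age) and the object names pbc2,
pfit, km and efit are placeholders for illustration only; they are not my
actual model.

```r
library(survival)

# Illustration only: pbc ships with the survival package; it is not the dev data.
# Drop the rows with a missing treatment arm so every fit uses the same subjects.
pbc2 <- subset(pbc, !is.na(trt))

# Cox model with placeholder covariates (not the age.sex/plan_type/uw_load model)
pfit <- coxph(Surv(time, status > 0) ~ trt + log(bili) + age, data = pbc2)

# Kaplan-Meier estimate for the whole cohort
km <- survfit(Surv(time, status > 0) ~ 1, data = pbc2)

# Expected survival from the Cox model, averaged over the subjects in pbc2
efit <- survexp(~ 1, ratetable = pfit, data = pbc2)

# Overlay the model-based curve on the Kaplan-Meier curve
plot(km, conf.int = FALSE, mark.time = FALSE, xlab = "Days", ylab = "Survival")
lines(efit, col = "purple")
```

This runs without trouble on a few hundred rows.  On dev I would presumably
have to run it on a sample of members, since the same call fails on all
899876 rows as shown above, but I do not know whether that is the
recommended approach.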


