[R] Incorrect 'n' returned by survfit()

yongchuan panyc at pacific.net.sg
Wed Oct 25 15:33:02 CEST 2006


I have a data set with 60000 rows representing 6000+ distinct loans. I ran a coxph() regression on it (see the call below), but the result of a subsequent survfit() call on the coxph object is almost certainly wrong: it reports n=6 when it should be more like 6000+ (I think).

> survfit(resultag)
Call: survfit.coxph(object = resultag)

      n  events  median 0.95LCL 0.95UCL 
      6     489     Inf       2     Inf 

When I reduced the data set to just 1000 rows, the output of the survfit() call on the coxph object looked more plausible.

> survfit(resulting)
Call: survfit.coxph(object = resulting)

      n  events  median 0.95LCL 0.95UCL 
    115      15     Inf     Inf     Inf 

Is there a limit on the size of the data set that I can read in?
Or am I just doing something silly above?

Thanks much.
Yongchuan

This is the coxph regression:

resultag <- coxph(Surv(Start, Stop, PrepayDate) ~ modBalance + closingCoupon + lienPosition + originalFICO, data = table)
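
One way to cross-check the reported n is to compare it with the row, loan, and event counts of the input, and to confirm that every (Start, Stop] interval is valid. This is only a sketch on a toy data frame, not the real table: loanID is a hypothetical column name (the actual identifier column is not shown here), and the values are made up for illustration.

```r
# Toy stand-in for the loan table, in counting-process layout
# (one row per loan per interval). loanID is a hypothetical column name.
loans <- data.frame(
  loanID     = c(1, 1, 1, 2, 2, 3),
  Start      = c(0, 1, 2, 0, 1, 0),
  Stop       = c(1, 2, 3, 1, 2, 1),
  PrepayDate = c(0, 0, 1, 0, 0, 1)   # 1 = prepaid at the end of the interval
)

n_rows        <- nrow(loans)                      # rows (loan-intervals)
n_loans       <- length(unique(loans$loanID))     # distinct loans
n_events      <- sum(loans$PrepayDate)            # rows flagged as events
bad_intervals <- sum(loans$Stop <= loans$Start)   # intervals need Start < Stop

cat("rows:", n_rows, "loans:", n_loans, "events:", n_events,
    "bad intervals:", bad_intervals, "\n")
```

Running the same counts on the full table and on the 1000-row subset gives figures to set beside the n and events columns printed by survfit().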


