[R] Defining reference category for a cph model summary inside of a "for" loop

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sat Mar 29 01:33:49 CET 2008


Wells, Brian wrote:
> I have the following code.
>
>> f <- cph(formula = Surv(TimeToDeath, Dead == "Yes") ~ 1, data=single.dat, x=T, y=T, surv=T)
>> for(i in c('A', 'B', 'C', 'D', 'E', 'F')){
>>   f <- update(f, as.formula(paste('Surv(TimeToDeath, Dead == "Yes")~',i,sep='')))
>>   print(summary(f, paste(i,"=1st Quartile", sep='')))
>> }
>
> There is no error message generated in R, but R ignores the reference
> category defined with paste in the summary function for the cph model.
>
> The output uses the "1st Quartile" as the reference category to
> calculate hazards for some of the variables defined by i, but not all of
> them.


Your code is confusing.  What is to the right of ~ in a formula is a 
predictor variable name, not a value.  If your variables are named A, B, 
C, ... you are OK.
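
To illustrate the point about formulas, here is a minimal base-R sketch. It builds the formulas the same way the loop does and inspects the right-hand side. The variable names A, B, C are taken from the question; no data or model fit is involved, and the names `vars`, `formulas` and `rhs` are made up for this example.

```r
# Variable names as used in the loop from the question
vars <- c("A", "B", "C")

# Build one formula per variable; the text to the right of ~ is parsed
# as a predictor *name*, never as a value of that predictor
formulas <- lapply(vars, function(v) {
  as.formula(paste('Surv(TimeToDeath, Dead == "Yes") ~', v))
})

# Extract the right-hand-side term of each formula
rhs <- vapply(formulas, function(fm) attr(terms(fm), "term.labels"),
              character(1))
print(rhs)
```

Each formula carries only the predictor's name, so the choice of reference category has to be given to summary separately.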

'1st Quartile' has no special meaning to R or Design, and you can't pass 
a character string as a second argument to summary and expect it to work.

You will need parse(text=paste(...)) to create an appropriate expression, which you then evaluate with eval().
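
A minimal sketch of the parse(text=paste(...)) idea, using base R only. `show_args` is a hypothetical stand-in for summary, not part of Design: it just reports the names of the extra arguments it receives. That makes the difference between the two kinds of call visible without fitting a model.

```r
# Hypothetical stand-in for summary(): it only reports the names of the
# extra arguments it receives, so the example runs without Design.
show_args <- function(object, ...) names(list(...))

i <- "A"

# The original loop's approach: one unnamed character string is passed.
unnamed <- show_args(NULL, paste(i, "=1st Quartile", sep = ""))

# Build the whole call as text, then parse and evaluate it, so the
# function receives an argument that is actually named A.
call_text <- paste('show_args(NULL, ', i, ' = "1st Quartile")', sep = "")
named <- eval(parse(text = call_text))

print(unnamed)  # NULL: no named argument was received
print(named)    # "A": the argument name was built from i
```

Inside the original loop, the analogous (untested) line would be print(eval(parse(text=paste('summary(f, ', i, '="1st Quartile")', sep='')))). This assumes "1st Quartile" really is a level of each of the variables A to F.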

But Design's summary gives you inter-quartile-range hazard ratios for continuous predictors by default anyway.

Beware of getting hazard ratios that are not adjusted for other 
variables needed in the model.

Frank Harrell

> Any help would be greatly appreciated.
>
> thanks
>
> Brian J. Wells, MD, MS
> Research Associate
> Quantitative Health Sciences
> Cleveland Clinic

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


