[R] Different behavior of model.matrix between R 3.2 and R 3.1.1

Frank Harrell f.harrell at Vanderbilt.Edu
Thu Jun 11 01:34:17 CEST 2015


For building design matrices for Cox proportional hazards models in the 
cph function in the rms package I have always used this construct:

Terms    <- terms(formula, specials=c("strat", "cluster", "strata"), data=data)
specials <- attr(Terms, 'specials')
stra     <- specials$strat
Terms.ns <- Terms
if(length(stra)) {
  temp     <- untangle.specials(Terms.ns, "strat", 1)
  Terms.ns <- Terms.ns[- temp$terms]    # uses the [.terms method
}
X <- model.matrix(Terms.ns, X)[, -1, drop=FALSE]

The Terms.ns logic removes the "main effects" of stratification factors, 
so that if a stratification factor interacts with a non-stratification 
factor, only the interaction terms are included in the design matrix, 
not the stratification factor's main effects. [In a Cox PH model, 
stratification goes into the nonparametric survival curve part of the 
model.]
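
As a minimal illustration of the term-dropping step (a sketch, not the 
actual cph code), the following uses a made-up data frame d and a 
stand-in strat() defined as the identity function; both are assumptions 
for the example. It only shows which terms survive the subsetting. The 
columns that model.matrix then returns are exactly what is in question 
here, so no output is claimed for that step.

```r
library(survival)                      # provides untangle.specials() and Surv()

# Hypothetical stand-in for the strat() marker function (assumption)
strat <- function(x) x

# Made-up data, for illustration only
d <- data.frame(time   = c(5, 8, 12, 3, 9, 7),
                status = c(1, 0, 1, 1, 0, 1),
                age    = c(50, 60, 70, 55, 65, 45),
                sex    = factor(c("f", "m", "f", "m", "f", "m")))

f     <- Surv(time, status) ~ age * strat(sex)
Terms <- terms(f, specials = "strat", data = d)

# Locate the first-order (main effect) strat terms and drop them
temp     <- untangle.specials(Terms, "strat", 1)
Terms.ns <- Terms[-temp$terms]         # dispatches to the [.terms method

print(attr(Terms,    "term.labels"))   # all terms
print(attr(Terms.ns, "term.labels"))   # strat main effect removed

# Build the design matrix from the reduced terms and drop the intercept.
# Which columns appear may depend on the R version, as described above.
mf <- model.frame(Terms, d)
X  <- model.matrix(Terms.ns, mf)[, -1, drop = FALSE]
print(colnames(X))
```

Comparing colnames(X) across R versions is one way to check whether the 
strat main-effect column has reappeared.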

This logic recently stopped working: model.matrix now keeps the unneeded 
main effects in the design matrix. It worked under R 3.1.1 and fails 
under R 3.2. Does anyone know what changed in R that could have caused 
this, and is there a workaround?

Note that cph is a kind of front-end to the survival package's coxph 
function.  Terry Therneau uses more complex logic to construct the 
design matrix reliably.  I'd like to avoid that logic because it creates 
an overly wide design matrix before removing the unneeded columns.
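
For comparison, here is a rough sketch of the "build wide, then drop 
columns" idea. It is not coxph's actual code, only one way such an 
approach could look: the full design matrix is built first, and its 
"assign" attribute is used to remove the intercept and the strat 
main-effect columns. The data frame d and the identity strat() are 
made up for the example.

```r
library(survival)                      # provides untangle.specials() and Surv()

# Hypothetical stand-in for the strat() marker function (assumption)
strat <- function(x) x

# Made-up data, for illustration only
d <- data.frame(time   = c(5, 8, 12, 3, 9, 7),
                status = c(1, 0, 1, 1, 0, 1),
                age    = c(50, 60, 70, 55, 65, 45),
                sex    = factor(c("f", "m", "f", "m", "f", "m")))

Terms <- terms(Surv(time, status) ~ age * strat(sex),
               specials = "strat", data = d)
mf    <- model.frame(Terms, d)

# Step 1: the full (overly wide) design matrix, strat main effects included
Xfull <- model.matrix(Terms, mf)

# Step 2: "assign" maps each column to its term index (0 = intercept)
asg  <- attr(Xfull, "assign")
temp <- untangle.specials(Terms, "strat", 1)

# Step 3: keep everything except the intercept and strat main effects
keep <- !(asg %in% c(0, temp$terms))
X    <- Xfull[, keep, drop = FALSE]
print(colnames(X))
```

The cost noted above is visible in step 1: Xfull carries columns that 
are thrown away immediately in step 3.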

Thanks for any assistance,
Frank


-- 
------------------------------------------------------------------------
Frank E Harrell Jr, Professor and Chairman, School of Medicine
Department of Biostatistics, Vanderbilt University