[R] freetype 2.5.2, problem with the survival package, build R 2.15.x with gcc 4.8.x

Hin-Tak Leung htl10 at users.sourceforge.net
Fri Dec 13 16:20:39 CET 2013


Thanks for taking the time to look at this issue with changes in the survival package. 

There are warnings (different ones) with any of the versions mentioned - so it is difficult to tell what to not ignore; -). In fact it is somewhat central and essential to the surrounding text 's argument that the matrices are expected to be singular, but the rank is used as an indicator to explore interaction - and lacking there of - of parameters. As far as I know this is also a stripped down version of a published paper. 

Scientific discovery which relies on tuning of numbers sounds "wonderful" - perhaps it is just bad science & bad use of statistics. This sounds like picking one's tuning number to fit one's desired conclusions. 

Would it be appropriate to generate an error with stop(), or something more visible? Yes, it would br nice to "hand-hold" the fitting, but at least in the outlined usage in that vignette, a large number of models are thrown in with a lot of data to see "what sticks", hand-holding individual fits is neither feasible nor appropriate. 

(I didn't write that vignette - I am just checking through it as a small part of a bigger task.)

Hin-tak 

------------------------------
On Fri, Dec 13, 2013 13:01 GMT Terry Therneau wrote:

>I was sent a copy of the data, and this is what I get on a different machine:
> fit <- clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set),
>         data=pscc)
>Warning messages:
>1: In fitter(X, Y, strats, offset, init, control, weights = weights,  :
>  Loglik converged before variable  1,2,3,4 ; beta may be infinite.
>2: In coxph(formula = Surv(rep(1, 13110L), cc) ~ addContr(A) + addContr(C) +  :
>  X matrix deemed to be singular; variable 5
>
> fit
>                     coef exp(coef) se(coef)         z  p
>addContr(A)2     -0.14250     0.867   110812 -1.29e-06  1
>addContr(C)2      0.00525     1.005   110812  4.74e-08  1
>addContr(A.C)1-2 -0.30097     0.740   110812 -2.72e-06  1
>addContr(A.C)2-1 -0.47712     0.621   110812 -4.31e-06  1
>addContr(A.C)2-2       NA        NA        0        NA NA
>
> xmat <- model.matrix(fit)
> svd(xmat)$d
>[1] 1.932097e+02 2.700101e+01 1.624731e+01 6.049630e-15 2.031334e-15
>
>The primary issue is that the covariates matrix is singular, having rank 3 instead of rank 5.
>The coxph routine prints two warning messages that things are not good about the matrix. Warning messages should not be ignored!  The insane se(coef) values in the printed result are an even bigger clue that the model fit is suspect. Unfortunately, some small change in the iteration path or numerics has put this data set over the edge from being seen as rank 3 (old run) to rank 4 (new run).  Moral: coxph does pretty well at detecting redundat variables, but if you know of some it never hurts to help the routine out by removing them before the fit.
>
>Singularity of the X matrix in a Cox model is very difficult to detect reliably; the current threshold is the result of long experience and experiment to give as few false messages as possible.  (The RMS package in particular used truncated power basis functions for the splines, which lead to X matrices that look almost singular numerically, but are not.)  Setting a little less stringent threshold for declaring singularity in the cholesky decompostion sufficies for this data set.
>
>fit2 <- clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set),
>         data=pscc, toler.chol=1e-10)
>
>I'll certainly add this to my list of test problems that I use to tune those constants.
>
>Terry Therneau
>
>On 12/11/2013 09:30 PM, Hin-Tak Leung wrote:
> Here is a rather long discussion etc about freetype 2.5.2, problem with the survival
> package, and build R 2.15.x with gcc 4.8.x. Please feel free to skip forward.
> 
> - freetype 2.5.2:
> 
> the fix to cope with one of the Mac OS X's system fonts just before the release of
> freetype 2.5.1 caused a regression, crashing over one of Microsoft windows' system fonts.
> So there is a 2.5.2. There are new 2.5.2 bundles for windows & Mac OS X. The official
> win/mac binaries of R were built statically with 2+-years-old freetype with a few known
> problems. Most should upgrade/rebuild.
> 
> http://sourceforge.net/projects/outmodedbonsai/files/R/
> 
> - problem with the survival package:
> 
> Trying to re-run a vignette to get the same result as two years ago
> reveal a strange change. I went and bisected it down to
> r11513 and r11516 of the survival package.
> 
> -------------- r11513 --------------------
> clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set))
> 
> 
>                     coef exp(coef) se(coef)     z      p
> addContr(A)2     -0.620     0.538    0.217 -2.86 0.0043
> addContr(C)2      0.482     1.620    0.217  2.22 0.0270
> addContr(A.C)1-2 -0.778     0.459    0.275 -2.83 0.0047
> addContr(A.C)2-1     NA        NA    0.000    NA     NA
> addContr(A.C)2-2     NA        NA    0.000    NA     NA
> 
> Likelihood ratio test=26  on 3 df, p=9.49e-06  n= 13110, number of events= 3524
> ------------------------------------------
> 
> ------------- r11516 ---------------------
> clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set))
> 
> 
>                       coef exp(coef) se(coef)         z  p
> addContr(A)2     -0.14250     0.867   110812 -1.29e-06  1
> addContr(C)2      0.00525     1.005   110812  4.74e-08  1
> addContr(A.C)1-2 -0.30097     0.740   110812 -2.72e-06  1
> addContr(A.C)2-1 -0.47712     0.621   110812 -4.31e-06  1
> addContr(A.C)2-2       NA        NA        0        NA NA
> 
> Likelihood ratio test=26  on 4 df, p=3.15e-05  n= 13110, number of events= 3524
> ------------------------------------------
> 



More information about the R-help mailing list