[R] Help understanding why glm and lrm.fit runs with my data, but lrm does not

David Winsemius dwinsemius at comcast.net
Thu Sep 14 17:48:21 CEST 2017


> On Sep 14, 2017, at 12:30 AM, Bonnett, Laura <L.J.Bonnett at liverpool.ac.uk> wrote:
> 
> Dear all,
> 
> I am using the publically available GustoW dataset.  The exact version I am using is available here: https://drive.google.com/open?id=0B4oZ2TQA0PAoUm85UzBFNjZ0Ulk
> 
> I would like to produce a nomogram for 5 covariates - AGE, HYP, KILLIP, HRT and ANT.  I have successfully fitted a logistic regression model using the "glm" function as shown below.
> 
> library(rms)
> gusto <- spss.get("GustoW.sav")
> fit <- glm(DAY30~AGE+HYP+factor(KILLIP)+HRT+ANT,family=binomial(link="logit"),data=gusto,x=TRUE,y=TRUE)
> 
> However, my review of the literature and other websites suggest I need to use "lrm" for the purposes of producing a nomogram.  When I run the command using "lrm" (see below) I get an error message saying:
> Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
>  Unable to fit model using "lrm.fit"
> 
> My code is as follows:
> gusto2 <- gusto[,c(1,3,5,8,9,10)]
> gusto2$HYP <- factor(gusto2$HYP, labels=c("No","Yes"))
> gusto2$KILLIP <- factor(gusto2$KILLIP, labels=c("1","2","3","4"))
> gusto2$HRT <- factor(gusto2$HRT, labels=c("No","Yes"))
> gusto2$ANT <- factor(gusto2$ANT, labels=c("No","Yes"))
> var.labels=c(DAY30="30-day Mortality", AGE="Age in Years", KILLIP="Killip Class", HYP="Hypertension", HRT="Tachycardia", ANT="Anterior Infarct Location")
> label(gusto2)=lapply(names(var.labels),function(x) label(gusto2[,x])=var.labels[x])
> 
> ddist = datadist(gusto2)
> options(datadist='ddist')
> 
> fit1 <- lrm(DAY30~AGE+HYP+KILLIP+HRT+ANT,gusto2)
> 
> Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
>  Unable to fit model using "lrm.fit"
> 
> Online solutions to this problem involve checking whether any variables are redundant.  However, the results for my data suggest  that none are.
> redun(~AGE+HYP+KILLIP+HRT+ANT,gusto2)
> 
> Redundancy Analysis
> 
> redun(formula = ~AGE + HYP + KILLIP + HRT + ANT, data = gusto2)
> 
> n: 2188         p: 5    nk: 3
> 
> Number of NAs:   0
> 
> Transformation of target variables forced to be linear
> 
> R-squared cutoff: 0.9   Type: ordinary
> 
> R^2 with which each variable can be predicted from all other variables:
> 
>   AGE    HYP KILLIP    HRT    ANT
> 0.028  0.032  0.053  0.046  0.040
> 
> No redundant variables
> 
> I've also tried just considering "lrm.fit" and that code seems to run without error too:
> lrm.fit(cbind(gusto2$AGE,gusto2$KILLIP,gusto2$HYP,gusto2$HRT,gusto2$ANT),gusto2$DAY30)
> 
> Logistic Regression Model
> 
> lrm.fit(x = cbind(gusto2$AGE, gusto2$KILLIP, gusto2$HYP, gusto2$HRT,
>     gusto2$ANT), y = gusto2$DAY30)
> 
>                       Model Likelihood     Discrimination    Rank Discrim.
>                          Ratio Test           Indexes           Indexes
> Obs          2188    LR chi2     233.59    R2       0.273    C       0.846
>  0           2053    d.f.             5    g        1.642    Dxy     0.691
>  1            135    Pr(> chi2) <0.0001    gr       5.165    gamma   0.696
> max |deriv| 4e-09                          gp       0.079    tau-a   0.080
>                                            Brier    0.048
> 
>           Coef     S.E.   Wald Z Pr(>|Z|)
> Intercept -13.8515 0.9694 -14.29 <0.0001
> x[1]        0.0989 0.0103   9.58 <0.0001
> x[2]        0.9030 0.1510   5.98 <0.0001
> x[3]        1.3576 0.2570   5.28 <0.0001
> x[4]        0.6884 0.2034   3.38 0.0007
> x[5]        0.6327 0.2003   3.16 0.0016
> 
> I was therefore hoping someone would explain why the "lrm" code is producing an error message, while "lrm.fit" and "glm" do not.  In particular I would welcome a solution to ensure I can produce a nomogram.

Try this:

lrm  # look at code, do a search on "fail"
?lrm.fit  # read the structure of the returned value of lrm.fit

my.fit <- lrm.fit(x = cbind(gusto2$AGE, gusto2$KILLIP, gusto2$HYP, gusto2$HRT,
    gusto2$ANT), y = gusto2$DAY30)

print(my.fit$fail)  # the error message you got from the lrm call means convergence failed

Documentation bug: The documentation of the cause of the 'fail'- value incorrectly gives the name of this parameter as 'maxiter' in the  Value section.

-- 
David.



> 
> Kind regards,
> Laura
> 
> Dr Laura Bonnett
> NIHR Post-Doctoral Fellow
> 
> Department of Biostatistics,
> Waterhouse Building, Block F,
> 1-5 Brownlow Street,
> University of Liverpool,
> Liverpool,
> L69 3GL
> 
> 0151 795 9686
> L.J.Bonnett at liverpool.ac.uk
> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law



More information about the R-help mailing list