[R] Re: [S] Labels wrong with lrm

fharrell@virginia.edu fharrell at virginia.edu
Sat Jul 28 14:56:11 CEST 2001


Dear Jan,

Thank you very much for your excellent description of the
problem and the self-contained test code.  This is a
problem that I've been meaning to either document better
or solve for some time.  The root of the problem is with
the builtin S-Plus terms.inner function:

> attr(terms.inner(asthma ~ pol(age,kx) + smok),'variables')
expression(age, kx, smok)

You can see that terms.inner inappropriately includes kx
as an independent variable as it does not know that
the first argument to pol is the special variable.
When a constant replaces kx, all is well.

As I am relying on the C code called by terms.inner to 
do the job, I don't have a ready solution.  I would
be happy if someone comes up with a solution.  The
all.vars function in the R language has the same
limitation:

all.vars(asthma ~ pol(age,kx) + smok)
[1] "asthma" "age"   "kx"   "smok"

Frank Harrell




Jan Brogger wrote:
> 
> I get funny behaviour from lrm when using transformations that require a
> parameter, and the parameter is a variable. It only happens when there are
> other variables in the model. The transformation must be the last term in
> the formula, or some labels will be wrong. For example, if the correct
> labels are "age" "age^2" and "smok", the labels will be "age", "age^2" and
> "kx" where "kx" is actually the parameter for the transformation, and it
> should be "smok".
> 
> I have tried including the variable that contains the parameter for the
> transformation with datadist, but it doesn't help. Also tried reinstalling
> newest Hmisc, Design and starting new project directory. Platform: S+2000
> Win98. It happens to both rcs and pol. With glm, everything is OK. The
> value of coefficients are the same.
> 
> It isn't a major issue, but it might puzzle some and I thought I'd let you
> know. Solutions are: specify the parameters transformations directly, not
> in variables, or keep your transformations last in the formula. Any better
> solutions ? It'd be nice to be able to keep knot locations in a variable,
> so you can change it only one place in your code.
> 
> This code reproduces the problem:
> library(Hmisc,T)
> library(Design,T)
> age <- rnorm(500,41.5,12) #age
> smok <- sample(0:2,500,rep=T) #smoking habit
> asthma <- sample(0:1,500,rep=T)  #asthma
> mydat <- data.frame(age,smok,asthma)
> dd <- datadist(mydat)
> options(datadist="dd")
> kx <- 2
> 
> #this model gives the wrong labels
> lrm(asthma~pol(age,kx)+smok,data=mydat)
> #this model gives the right label
> lrm(asthma~smok+pol(age,kx),data=mydat)
> 
> #If you don't use a variable, but a value, then everything is ok
> lrm(asthma~smok+pol(age,2),data=mydat)
> lrm(asthma~pol(age,2)+smok,data=mydat)
> #If you only have one term, then everything is OK
> lrm(asthma~pol(age,2),data=mydat)
> #It doesn't happen with GLM:
> glm(asthma~pol(age,2)+smok,data=mydat,family=binomial)
> #but it does with rcs:
> kx <- c(15,40,70)
> lrm(asthma~rcs(age,kx)+smok,data=mydat)
> lrm(asthma~smok+rcs(age,kx),data=mydat)
> 
> Jan Brogger,
> University of Bergen, Norway
> ---------------------------------------------------------------------
> This message was distributed by s-news at lists.biostat.wustl.edu.  To
> unsubscribe send e-mail to s-news-request at lists.biostat.wustl.edu with
> the BODY of the message:  unsubscribe s-news

-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list