[R] I have a problem

David Winsemius dwinsemius at comcast.net
Sat Jul 31 17:15:56 CEST 2010


On Jul 31, 2010, at 10:26 AM, 笑啸 wrote:

> dear£ºDavid Winsemius
>   in  the  example£¨nomogram£©£¬I don't understand the meanings  
> of the program which have been  marked by comment  line.And how to  
> compile the program.

R is an interpreted language. There is no compilation. You submit  
these lines of text to the console by one of:
a) copying and pasting or
b) you bring them in from a text file with source() or
c) in this case you could get them to run automagically by typing:

# require(rms) ... I assume you already have that package loaded
example(nomogram)

> (L <- .4*(sex=='male') + .045*(age-50) +(log(cholesterol -  
> 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))).
>
>
> example(nomogram)
> n <- 1000 # define sample size
> set.seed(17) # so can reproduce the results
> age <- rnorm(n, 50, 10)
> blood.pressure <- rnorm(n, 120, 15)
> cholesterol <- rnorm(n, 200, 25)
> sex <- factor(sample(c('female','male'), n,TRUE))
>
>
> #---need explanation for code starting here-----
>
>
> # Specify population model for log odds that Y=1
> L <- .4*(sex=='male') + .045*(age-50) +
> (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))

This line takes input from all the variables created above and creates  
a summary simulation value for each "subject" on the log-odds scale.  
It is used as input to a random number function below that returns a 1  
or 0 based on whether or not plogis(L) is above or below a uniform  
random value between 0 and 1. You can alter the simulation model by  
changing the parameters in that L function. If the lrm function is  
applied properly it will be able to "recover" or "discover" those  
parameter values. For instance there should be an age coefficient of  
0.045. The other coefficients might be more difficult to extract  
simply, since there is an implied interaction between the cholesterol  
and gender variables.

If you run an lrm model with a linear term for age (which is not how  
Harrell's examples are set up)

lrm(y ~ age+sex*rcs(cholesterol,4)+blood.pressure)

you get an estimate for age, estimates for an interaction between sex  
and cholesterol, and an estimate for blood pressures (which was not  
used in construction of the simulated values so the the BP estimate  
should be close to zero.)

... and the line for the estimate for the age term is
                          Coef       S.E.     Wald Z P
age                        0.034953 0.006672  5.24  0.0000

so the "true value of the age effect" (0.045) is within the 95% CI  
estimate (0.0349 +/- 1.96*0.0066).

... and the estimate for blood pressure has an estimate whose 95% CI  
includes zero, i.e. is not "statistically significant".
                          Coef       S.E.     Wald Z P
blood.pressure            -0.002083 0.004366 -0.48  0.6333

I suspect you need to read up in your logistic regression texts as  
well as reviewing further documents from Harrell's website.:

http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS

>
>
> # ---end----
>
>
> # Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
> y <- ifelse(runif(n) < plogis(L), 1, 0)
> ddist <- datadist(age, blood.pressure, cholesterol, sex)
> options(datadist='ddist')
> f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)
> nom <- nomogram(f, fun=function(x)1/(1+exp(-x)), # or fun=plogis
> fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999),
> funlabel="Risk of Death")
> #Instead of fun.at, could have specified fun.lp.at=logit of
> ...
> Thank you for you expain !
>
-- 

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list