[R] ols function in rms package

Frank E Harrell Jr f.harrell at Vanderbilt.Edu
Mon Jun 7 15:22:37 CEST 2010


On 06/06/2010 10:49 PM, Mark Seeto wrote:
> Hello,
>
> I have a couple of questions about the ols function in Frank Harrell's rms
> package.
>
> Is there any way to specify variables by their column number in the data
> frame rather than by the variable name?
>
> For example,
>
> library(rms)
> x1<- rnorm(100, 0, 1)
> x2<- rnorm(100, 0, 1)
> x3<- rnorm(100, 0, 1)
> y<- x2 + x3 + rnorm(100, 0, 5)
> d<- data.frame(x1, x2, x3, y)
> rm(x1, x2, x3, y)
> lm(y ~ d[,2] + d[,3], data = d)  # This works
> ols(y ~ d[,2] + d[,3], data = d) # Gives error
> Error in if (!length(fname) || !any(fname == zname)) { :
>    missing value where TRUE/FALSE needed
>
> However, this works:
> ols(y ~ x2 + d[,3], data = d)
>
> The reason I want to do this is to program variable selection for
> bootstrap model validation.
>
> A related question: does ols allow "y ~ ." notation?
>
> lm(y ~ ., data = d[, 2:4])  # This works
> ols(y ~ ., data = d[, 2:4]) # Gives error
> Error in terms.formula(formula) : '.' in formula and no 'data' argument
>
> Thanks for any help you can give.
>
> Regards,
> Mark

Hi Mark,

It appears that you answered the questions yourself.  rms wants formula
terms to be real variables or transformations of them, not expressions
such as d[,2], because it makes certain assumptions about the names of
terms.  The y ~ . notation should work, though; I'll have a look at that
sometime.
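
If the predictors are only known by column number, one possible workaround
is to look up their names and build the formula from those, so that every
term is a real variable.  A minimal sketch, assuming simulated data like
that in the question; the names cols, f and f_all are illustrative only:

```r
library(rms)

# Simulated data (an assumption, mirroring the example in the question)
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)

# Turn column numbers into variable names, then into a formula
cols <- c(2, 3)
f <- reformulate(names(d)[cols], response = "y")   # y ~ x2 + x3
fit <- ols(f, data = d)

# Stand-in for "y ~ .": name every column except the response
f_all <- reformulate(setdiff(names(d), "y"), response = "y")
fit_all <- ols(f_all, data = d)
```

reformulate() is in the stats package, so nothing beyond rms itself is
needed.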

But these are the small questions compared to what you really want.  Why 
do you need variable selection, i.e., what is wrong with having 
insignificant variables in a model?  If you indeed need variable 
selection, see if backwards stepdown works for you.  It is built into the
rms bootstrap validation and calibration functions.
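
A minimal sketch of what that can look like, assuming simulated data and
arbitrary settings (B = 100, stepdown by p-value at 0.05); these choices are
illustrative, not recommendations:

```r
library(rms)

# Simulated data (an assumption; substitute your own data frame)
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100)

# x = TRUE, y = TRUE store the design matrix and response in the fit,
# which the resampling functions need
fit <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)

# Backwards stepdown on the full fit
fastbw(fit, rule = "p", sls = 0.05)

# Bootstrap validation and calibration, repeating the stepdown
# within each resample (bw = TRUE)
val <- validate(fit, B = 100, bw = TRUE, rule = "p", sls = 0.05)
cal <- calibrate(fit, B = 100, bw = TRUE, rule = "p", sls = 0.05)
print(val)
```

With bw = TRUE the selection step is repeated in every bootstrap sample, so
the validation indexes reflect the uncertainty caused by selecting
variables.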

Frank

-- 
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                      Department of Biostatistics   Vanderbilt University


