[R] Error in Design package: dataset not found for options(datadist)

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Apr 17 15:03:24 CEST 2008


Gad Abraham wrote:
> Hi,
> 
> Design isn't strictly an R base package, but maybe someone can explain 
> the following.
> 
> When lrm is called within a function, it can't find the dataset dd:
> 
>  > library(Design)
>  > age <- rnorm(30, 50, 10)
>  > cholesterol <- rnorm(30, 200, 25)
>  > ch <- cut2(cholesterol, g=5, levels.mean=TRUE)
>  > fit <- function(ch, age)
> + {
> +    d <- data.frame(ch, age)
> +    dd <- datadist(d)
> +    options(datadist="dd")
> +    lrm(ch ~ age, data=d, x=TRUE, y=TRUE)
> + }
>  > fit(ch, age)
> Error in Design(eval(m, sys.parent())) :
>    dataset dd not found for options(datadist=)
> 
> It works outside a function:
>  > d <- data.frame(ch, age)
>  > dd <- datadist(d)
>  > options(datadist="dd")
>  > l <- lrm(ch ~ age, data=d, x=TRUE, y=TRUE)
> 
> 
> Thanks,
> Gad

My guess is that you'll need to put dd in the global environment, not in 
fit's environment.  At any rate, it is inefficient to call datadist every 
time fit is called.  Why not call it once for the whole data frame 
containing all the predictors, at the top of the program?
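
A rough base-R sketch of the scoping issue, for illustration only. The 
function needs_global() is a hypothetical stand-in that looks up the name 
stored in options(datadist=) in the global environment only; that lookup 
rule is an assumption matching the guess above, not Design's actual code. 
lapply(d, range) is likewise just a crude placeholder for datadist(d).

```r
# Hypothetical stand-in for a fitter that resolves options(datadist=)
# by name in the global environment only (assumed behaviour).
needs_global <- function() {
  nm <- getOption("datadist")
  if (!exists(nm, envir = .GlobalEnv, inherits = FALSE))
    stop("dataset ", nm, " not found for options(datadist=)")
  get(nm, envir = .GlobalEnv, inherits = FALSE)
}

fit_local <- function(d) {
  dd <- lapply(d, range)        # local to fit_local(): not visible globally
  options(datadist = "dd")
  needs_global()
}

fit_global <- function(d) {
  # Publish dd in the global environment before the lookup happens.
  assign("dd", lapply(d, range), envir = .GlobalEnv)
  options(datadist = "dd")
  needs_global()
}

d <- data.frame(age = c(40, 50, 60))
res_local  <- tryCatch(fit_local(d),  error = function(e) "not found")
res_global <- tryCatch(fit_global(d), error = function(e) "not found")
```

If the guess holds, the corresponding change in the original fit() would 
be assign("dd", datadist(d), envir = .GlobalEnv) in place of 
dd <- datadist(d), or simply calling datadist once at top level.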

Also, it is statistically inefficient to chop continuous variables into 
intervals.  You can use the proportional odds model with the continuous 
cholesterol as the response variable instead, although it will be slow 
if the response has more than, say, 100 unique values.

Frank

> 
> 
>  > sessionInfo()
> R version 2.6.2 (2008-02-08)
> x86_64-pc-linux-gnu
> 
> ...
> 
> attached base packages:
> [1] splines   stats     graphics  grDevices utils     datasets  methods
> [8] base
> 
> other attached packages:
> [1] Design_2.1-1  survival_2.34 Hmisc_3.4-3
> 
> loaded via a namespace (and not attached):
> [1] cluster_1.11.9  grid_2.6.2      lattice_0.17-4  rcompgen_0.1-17
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


