[R] Clever R syntax for extracting a subset of observations

Ajay Shah ajayshah at mayin.org
Thu Apr 8 17:54:30 CEST 2004


I know that if:
   x = seq(1,10)
   d = c(7,3,2)
and if I say
   y = x[d]
then I get the vector y as (7,3,2). Very clever! This idea is used
heavily by the boot package.
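
For reference, a minimal sketch of the indexing idea above. The
resampling lines are an illustrative assumption about how an index
vector with repeats behaves; the seed and the names idx and resample
are hypothetical and not part of the original example.

```r
x <- seq(1, 10)
d <- c(7, 3, 2)

# Indexing by a vector of positions returns those elements, in that order
y <- x[d]
print(y)

# Positions may repeat, which is the shape of a bootstrap resample:
# draw length(x) positions with replacement and index with them
set.seed(1)  # arbitrary seed, only to make the sketch repeatable
idx <- sample(length(x), replace = TRUE)
resample <- x[idx]
print(resample)
```

Each entry of resample is some element of x, possibly picked more than
once, so the statistic can be recomputed on it just as on x itself.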

Now consider the following code (which works):

  ---------------------------------------------------------------------------
  library(boot)

  sdratio <- function(D, d) {
    return(sd(D$x[d])/sd(D$y[d]))
  }

  x = runif(100)
  y = 2*runif(100)
  D = data.frame(x, y)

  b = boot(D, sdratio, R=1000)
  cat("Standard deviation of sdratio = ", sd(b$t[,1]), "\n")
  ---------------------------------------------------------------------------

Now it would be so elegant to say:

  sdratio <- function(D, d) {
    E = D[d]
    return(sd(E$x)/sd(E$y))
  }

But this doesn't work: when D is a data frame, D[d] raises an
error. Let me show you:

> x = runif(100)
> y = runif(100)
> D = data.frame(x, y)
> d = c(7,3,2)
> E = D[d] 
Error in "[.data.frame"(D, d) : undefined columns selected

Any suggestions on how one can do such pretty things as D[d] where D
is a data frame?
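
One possible reading of the error, offered as a sketch and not as a
definitive answer: with a single index, a data frame is subset like a
list, so D[d] asks for columns 7, 3 and 2, and D has only two columns.
Giving a row index and leaving the column index empty, D[d, ], selects
rows instead. The seed below is an arbitrary assumption; sdratio is the
function from this message, rewritten on that assumption.

```r
set.seed(42)  # arbitrary seed, only to make the sketch repeatable
x <- runif(100)
y <- runif(100)
D <- data.frame(x, y)
d <- c(7, 3, 2)

# Row index first, empty column index: rows 7, 3 and 2 with all columns
E <- D[d, ]
print(E)

# The statistic written against the row-subsetted data frame
sdratio <- function(D, d) {
  E <- D[d, ]
  sd(E$x) / sd(E$y)
}

print(sdratio(D, d))
```

Written this way, sdratio keeps the (data, indices) argument order used
in the working boot() call earlier in this message.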

-- 
Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi



