[R] Clever R syntax for extracting a subset of observations

Gabor Grothendieck ggrothendieck at myway.com
Mon Apr 12 07:10:56 CEST 2004


sdratio <- function(D, d) with( D[d,], sd(x)/sd(y) )
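
A minimal sketch of why the one-liner above works, assuming toy data like that in the quoted post (the names E and sdratio_long are only for illustration). With a comma, D[d,] uses d as row indices and keeps every column; without the comma, D[d] uses d as column indices, which is what triggers the "undefined columns selected" error on a two-column data frame.

```r
# Toy data, assumed for illustration only
set.seed(1)
D <- data.frame(x = runif(100), y = 2 * runif(100))
d <- c(7, 3, 2)

# With the comma, d picks rows 7, 3, 2 and all columns are kept
E <- D[d, ]

# Version from this reply: subset the rows, then evaluate inside the subset
sdratio <- function(D, d) with(D[d, ], sd(x) / sd(y))

# Version from the quoted post: index each column separately
sdratio_long <- function(D, d) sd(D$x[d]) / sd(D$y[d])

# Both versions should agree on the same index vector
all.equal(sdratio(D, d), sdratio_long(D, d))
```

Either version has the (data, indices) signature that boot() expects for its statistic argument, so it can be passed in as in the quoted call boot(D, sdratio, R=1000).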



Ajay Shah <ajayshah <at> mayin.org> writes:

: 
: I know that if:
:    x = seq(1,10)
:    d = c(7,3,2)
: and if I say
:    y = x[d]
: then I get the vector y as (7,3,2). Very clever! This idea is used
: extensively with the boot package.
: 
: Now consider the following code (which works):
: 
:   ---------------------------------------------------------------------------
:   library(boot)
: 
:   sdratio <- function(D, d) {
:     return(sd(D$x[d])/sd(D$y[d]))
:   }
: 
:   x = runif(100)
:   y = 2*runif(100)
:   D = data.frame(x, y)
: 
:   b = boot(D, sdratio, R=1000)
:   cat("Standard deviation of sdratio = ", sd(b$t[,1]), "\n")
:   ---------------------------------------------------------------------------
: 
: Now it would be so elegant to say:
: 
:   sdratio <- function(D, d) {
:     E = D[d]
:     return(sd(E$x)/sd(E$y))
:   }
: 
: But this doesn't work since if D is a data frame, you can't say
: D[d]. Let me show you:
: 
: > x = runif(100)
: > y = runif(100)
: > D = data.frame(x, y)
: > d = c(7,3,2)
: > E = D[d] 
: Error in "[.data.frame"(D, d) : undefined columns selected
: 
: Any suggestions on how one can do such pretty things as D[d] where D
: is a data frame?
:



