[R] Subarray specification problem

Roy Shimizu rshmz29 at gmail.com
Thu Dec 16 17:00:51 CET 2010


Hi.  I'm new to R, and I'm still learning R's system for addressing
subsets of data structures.  I'm particularly interested in the
problem of selecting subarrays based on complex criteria involving the
dimnames (as opposed to the values of the cells) of the array.  Here's
an example of such a problem.

Suppose I have an array x of unknown dimensions (it may have been
passed as the argument to a function I'm coding), but I know that one
of its dimensions is called "time", and has values that are (or can be
meaningfully coerced into) integers.  To make this specification
clearer, here's one possible example of such an array x:

> (x <- array(runif(2*5*2), dim=c(2,5,2), dimnames=list(NULL, time=round(100*runif(5)), NULL)))
, , 1

      time
              84        69        61         16        77
  [1,] 0.4020976 0.8250189 0.3402749 0.09754860 0.2189114
  [2,] 0.5309967 0.5414850 0.9431449 0.08716723 0.5819100

, , 2

      time
              84        69        61        16        77
  [1,] 0.6238213 0.1210083 0.7823269 0.5004058 0.5474356
  [2,] 0.2491087 0.7449411 0.9561074 0.6685954 0.3871533


Now, here's the problem: I want to write an R expression that will
give me the subarray y of x consisting of the cells whose "time"
values are greater than 20 but less than 80.

For the example x given above, the desired expression would evaluate
to this array:

, , 1

      time
              69        61         77
  [1,] 0.8250189 0.3402749 0.2189114
  [2,] 0.5414850 0.9431449 0.5819100

, , 2

      time
              69        61        77
  [1,] 0.1210083 0.7823269 0.5474356
  [2,] 0.7449411 0.9561074 0.3871533



How can I write such an expression for a general array x as described above?

Remember, the x shown above is just an example.  In the general case
all I know is that one of x's dimensions is called "time", and that
its values are (or can be meaningfully coerced into) integers.  I
*don't* know where among x's dimensions it is.  Hence, the following
is *not* a solution to the problem, even though it produces the right
answer for the example above:

> t <- as.integer(dimnames(x)$time)
> y <- x[,which(t > 20 & t < 80),]

This solution does not work in general, because the expression
"x[,which(t > 20 & t < 80),]" relies on the prior knowledge that the
"time" dimension is the second one of three.
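
For what it's worth, here is a rough sketch of one direction that might
avoid hard-coding the position.  It is only a sketch under assumptions:
the function name subset_by_time and its lo/hi arguments are made up for
illustration, and it assumes that names(dimnames(x)) contains "time"
exactly once.  The idea is to look up the position of "time" by name,
build one index per dimension (TRUE meaning "keep everything"), and pass
the whole list to `[` via do.call.

```r
# Hypothetical helper (name and arguments are illustrative only).
subset_by_time <- function(x, lo = 20, hi = 80) {
  # Find the "time" dimension by name instead of by position.
  k <- match("time", names(dimnames(x)))
  if (is.na(k)) stop("x has no dimension named 'time'")

  # dimnames are stored as character strings; coerce them to integers.
  tt <- as.integer(dimnames(x)[[k]])

  # One index per dimension: TRUE selects the full extent ...
  idx <- rep(list(TRUE), length(dim(x)))
  # ... except along "time", where only the matching positions are kept.
  idx[[k]] <- which(tt > lo & tt < hi)

  # Same as x[TRUE, idx, TRUE, drop = FALSE] when "time" is 2nd of 3.
  # drop = FALSE keeps the number of dimensions and the dimnames intact.
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

# Usage with the example above:
# y <- subset_by_time(x)
```

There may well be a more idiomatic way to do this.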

Any ideas?

Thanks!

Roy


