[R] multi-dimensional hash

Thomas W Blackwell tblackw at umich.edu
Thu Oct 2 15:11:49 CEST 2003


Arne  -

In the past, I've used a data frame for the lookup table and
the "and" of individual logical vectors to select rows from it.
Here's a simplified version of the selector function I wrote.
My mail editor does not balance parentheses, so I don't guarantee
that this version is syntactically correct.  Various sanity
checks have been omitted for clarity.

 #  Return the row numbers at which every named argument matches.
 #  A row is kept when, for each argument, its value in the column
 #  of that name is among the values supplied, so the match is to
 #  the cross-product of the individual arguments.  Rows are
 #  returned in data set order.  The values to select for should
 #  be supplied as named arguments in ... .

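select <- function(data, ...) {
    test <- list(...)
    #  column position of each named argument (0 if there is no such column)
    col.index <- match(names(test), names(data), 0)

    #  one logical column per argument: TRUE where the row's value is requested
    hits <- sapply(seq(along=test),
        function(i, d, n, t) match(d[[n[i]]], t[[i]], 0) > 0,
        data, col.index, test)

    #  keep the rows where every argument matched
    seq(length(data[[1]]))[ apply(hits, 1, all) ]
}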
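
To make the calling convention concrete, here is a small usage sketch.
The data frame `lookup` and its columns `dim1`, `dim2` and `value` are
invented for illustration and are not from the original message; the
function body is the one above, only laid out differently.  As written
it relies on sapply() returning a matrix, so it assumes the table has
more than one row and that every argument name is a column of the data.

```r
select <- function(data, ...) {
    test <- list(...)
    col.index <- match(names(test), names(data), 0)
    hits <- sapply(seq(along=test),
        function(i, d, n, t) match(d[[n[i]]], t[[i]], 0) > 0,
        data, col.index, test)
    seq(length(data[[1]]))[ apply(hits, 1, all) ]
}

#  hypothetical two-dimensional lookup table, one row per cell
lookup <- data.frame(
    dim1  = c("A", "A", "B", "B", "C"),
    dim2  = c("a", "b", "a", "b", "a"),
    value = c(1.5, 2.5, 3.5, 4.5, 5.5),
    stringsAsFactors = FALSE)

#  rows where dim1 is "A" or "B" AND dim2 is "a"
rows <- select(lookup, dim1 = c("A", "B"), dim2 = "a")
print(rows)             #  1 3
print(lookup[rows, ])   #  the matching cells
```

Supplying several values for one argument widens the match within that
column, while each additional named argument narrows it.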

In my own code, select() is called by any of several wrapper
functions which implement specific queries, and supply the names
of the columns on which they want to select.  The results from
several queries are saved as integer row numbers, then combined
and re-sorted before I actually subscript the data frame with
them to extract the data.
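
A rough sketch of that workflow follows.  The wrapper names `by.dim1`
and `by.cell`, the table `lookup` and its columns are all assumptions
made up for the example, and the `select()` used here is a compact
stand-in built on %in%, not the function from my own code.

```r
#  stand-in for select(): row numbers where every named argument matches
select <- function(data, ...) {
    test <- list(...)
    keep <- rep(TRUE, nrow(data))
    for (nm in names(test)) keep <- keep & data[[nm]] %in% test[[nm]]
    which(keep)
}

#  hypothetical lookup table
lookup <- data.frame(
    dim1  = c("A", "A", "B", "B", "C"),
    dim2  = c("a", "b", "a", "b", "a"),
    value = c(1.5, 2.5, 3.5, 4.5, 5.5),
    stringsAsFactors = FALSE)

#  hypothetical wrappers: each one names the columns it selects on
by.dim1 <- function(data, values) select(data, dim1 = values)
by.cell <- function(data, v1, v2) select(data, dim1 = v1, dim2 = v2)

#  run the queries and keep only integer row numbers
rows1 <- by.dim1(lookup, "C")                  #  5
rows2 <- by.cell(lookup, c("A", "B"), "b")     #  2 4

#  combine, drop duplicates, restore data set order, then subscript once
rows   <- sort(unique(c(rows1, rows2)))        #  2 4 5
result <- lookup[rows, ]
print(result)
```

Keeping the intermediate results as row numbers means the data frame
is subscripted only once, after all the queries have been merged.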

HTH  -  tom blackwell  -  u michigan medical school  -  ann arbor  -

On Thu, 2 Oct 2003 Arne.Muller at aventis.com wrote:

> I was wondering what's the best data structure in R for a multi-dimensional
> lookup table, and how to implement it. I've several categories say "A", "B",
> "C" ... and within each of these categories there are other categories such
> as "a", "b", "c", ... . There can be up to 5 dimensions. The actual value for
> [A][a]... is then a vector.
>
> 	I'm looking forward to any suggestions,
> 	+thanks very much for your help,
>
> 	Arne
>



