[R] is match slow?
tlumley at u.washington.edu
Tue Nov 20 18:35:38 CET 2001
On Tue, 20 Nov 2001, Agustin Lobo wrote:
> I'm doing
> m <- match(matriz, origen, 0)
> where matriz is a 270x900 matrix and
> origen a 11675 elements vector, and is taking
> a very long time.
> Is match a function
> implemented in C? If not, would a C
> code be faster?
Well, typing the function name at the R prompt gives
function (x, table, nomatch = NA, incomparables = FALSE)
{
    if (!is.logical(incomparables) || incomparables)
        .NotYetUsed("incomparables != FALSE")
    .Internal(match(if (is.factor(x)) as.character(x) else x,
        if (is.factor(table)) as.character(table) else table,
        nomatch))
}
showing that it is .Internal and thus in compiled C code. Looking at
src/main/unique.c reveals that it is implemented by sticking `table' in a
hash table and looking up each element of x, which is a pretty good
algorithm for this problem. If the hash function is good it will take
about length(table)+length(x) hash computations, and you won't be able to
beat that easily.
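
To make the idea concrete, here is a rough sketch of the same strategy in plain R, using an environment as the hash table. This is not the code in src/main/unique.c; hash_match is a made-up name for illustration, and the sketch assumes the keys are non-missing and non-empty once converted to character. It will be far slower than the real match(), since the loop runs in interpreted R.

```r
# Hypothetical helper for illustration only; the real match() does this in C.
# Assumes keys are non-missing and non-empty after as.character().
hash_match <- function(x, table, nomatch = NA_integer_) {
  nomatch <- as.integer(nomatch)
  keys <- as.character(table)
  h <- new.env(hash = TRUE, size = max(29L, length(keys)))
  # Step 1: put every element of `table` in the hash table.
  # Inserting in reverse order lets the first occurrence of a
  # duplicate overwrite later ones, which is what match() reports.
  for (i in rev(seq_along(keys))) assign(keys[i], i, envir = h)
  # Step 2: one hash lookup per element of x.
  vapply(as.character(x), function(k) {
    if (exists(k, envir = h, inherits = FALSE)) {
      get(k, envir = h, inherits = FALSE)
    } else {
      nomatch
    }
  }, integer(1), USE.NAMES = FALSE)
}

x   <- c(3L, 7L, 3L, 10L)
tab <- c(7L, 3L, 5L, 3L)
hash_match(x, tab, 0)
# Should agree with the built-in:
identical(hash_match(x, tab, 0), match(x, tab, 0L))
```

The cost is one insertion per element of table plus one lookup per element of x, which is where the length(table)+length(x) count comes from.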
I don't even find it that slow
 0.27 0.01 0.33 0.00 0.00
or with a lot of matches
 0.01 0.00 0.01 0.00 0.00
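
If you want to try something similar yourself, a timing along these lines should do. The data here are invented stand-ins with the sizes from your message (the seed and the range 1e6 are arbitrary choices), so the numbers you get will depend on your machine and on how many values actually match.

```r
set.seed(1)  # arbitrary; the original data are not available
# Stand-ins with the same sizes as in the question
origen <- sample(1e6, 11675)
matriz <- matrix(sample(1e6, 270 * 900, replace = TRUE),
                 nrow = 270, ncol = 900)

# Time the call exactly as written in the question
timing <- system.time(m <- match(matriz, origen, 0))
print(timing)

# match() returns a plain vector, so restore the matrix shape if needed
dim(m) <- dim(matriz)
```

If this runs quickly but your real call does not, the time is probably going somewhere other than match() itself.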
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle