# [R] counting the occurrences of vectors

Gabor Grothendieck ggrothendieck at myway.com
Tue Jul 6 06:22:22 CEST 2004

```
Marc Schwartz <MSchwartz <at> MedAnalytics.com> writes:

> the likely overhead involved in paste()ing together the rows
> to create objects

I thought I would check this, and it seems that in my original f1 function
it's not really the paste itself that's the bottleneck but the apply() call
around it.  If we use do.call rather than apply, as shown in f1a below, then
f1a runs faster than row.match.count (which in turn was faster than f1):

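f1a <- function(a,b,sep=":") {
  # paste the columns together element-wise, so each row becomes one string
  f <- function(...) paste(..., sep=sep)
  a2 <- do.call("f", as.data.frame(a))
  b2 <- do.call("f", as.data.frame(b))
  # count each row of a among the rows of b; unique(a2) is appended so every
  # key exists in the table, and the - 1 removes that extra occurrence
  c(table(c(b2,unique(a2)))[a2] - 1)
}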

> set.seed(1)
> # note that we have increased the size of the matrices from last post
> # to better show the speed difference
> a <- matrix(sample(3,10000,rep=T),nc=5)
> b <- matrix(sample(3,1000,rep=T),nc=5)

> # row.match.count taken from Marc's post in this thread
> # have put a c(...) around row.match.count to make it comparable to f1a
> gc(); system.time(ans <- c(row.match.count(b,a)))
         used (Mb) gc trigger (Mb)
Ncells 436079 11.7     741108 19.8
Vcells 130663  1.0     786432  6.0
[1] 0.11 0.00 0.11   NA   NA

> gc(); system.time(ansf1a <- f1a(b,a))
         used (Mb) gc trigger (Mb)
Ncells 436080 11.7     741108 19.8
Vcells 130669  1.0     786432  6.0
[1] 0.04 0.00 0.04   NA   NA

> all.equal(ansf1a,ans)
[1] TRUE
>

```
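The difference between the two ways of building row keys can be seen on a small example. The following is only a sketch: the helper names row_keys_apply, row_keys_docall and count_rows are made up for illustration and do not appear in the thread, and the f1 function from the earlier post is assumed (not confirmed here) to have used an apply-style paste over rows. The do.call version calls paste once on whole columns, whereas the apply version calls paste once per row.

```r
# Hypothetical helper names; none of these come from the original thread.

# Build one key per row by calling paste() separately on every row.
row_keys_apply <- function(m, sep = ":") {
  apply(m, 1, paste, collapse = sep)
}

# Build the same keys with a single vectorised paste() over the columns.
row_keys_docall <- function(m, sep = ":") {
  do.call(paste, c(as.data.frame(m), sep = sep))
}

# For each row of `a`, count how many rows of `b` are identical to it
# (same counting idea as f1a above).
count_rows <- function(a, b, sep = ":") {
  a2 <- row_keys_docall(a, sep)
  b2 <- row_keys_docall(b, sep)
  # Appending unique(a2) guarantees every key of `a` is in the table;
  # subtracting 1 removes that extra occurrence again.
  c(table(c(b2, unique(a2)))[a2] - 1)
}

# Tiny made-up data, small enough to check by hand.
a <- matrix(c(1, 2, 3,
              1, 2, 3,
              3, 2, 1), ncol = 3, byrow = TRUE)
b <- matrix(c(1, 2, 3,
              9, 9, 9), ncol = 3, byrow = TRUE)

# Both approaches should give the same keys for the rows of `a`.
keys_match <- identical(row_keys_apply(a), row_keys_docall(a))

# How often does each row of `b` occur in `a`?
counts <- unname(count_rows(b, a))
```

Here the first row of b occurs twice in a and the second row not at all, so counts should hold 2 and 0. Wrapping either key builder in system.time() on larger matrices, as in the session above, is one way to compare their speed on a given machine.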