tabulate

Prof Brian D Ripley ripley@stats.ox.ac.uk
Tue, 25 Jan 2000 07:04:31 +0000 (GMT)


On 25 Jan 2000, Peter Dalgaard BSA wrote:

> Bill Venables <William.Venables@cmis.CSIRO.AU> writes:
> 
> > OK Peter.  This is the first one I cooked up:
> ...
> > > m <- rpois(100000, 1)
> > > tabulate(m)
> > [1] 36891 18399  6064  1519   309    50     4     1
> > > table(m)
> > m
> >     0     1     2     3     4     5     6     7     8 
> > 36763 36891 18399  6064  1519   309    50     4     1 
> > > system.time(tabulate(m))
> > [1] 0.11 0.00 0.00 0.00 0.00
> > > system.time(table(m))
> > [1] 2.90 0.16 4.00 0.00 0.00
> > > version
> 
> OK first, notice that I get:
> 
> > system.time(table(m))
> [1] 3.38 0.00 3.38 0.00 0.00
> > system.time(f<-factor(m))
> [1] 2.12 0.00 2.12 0.00 0.00
> > system.time(table(f))
> [1] 1.19 0.00 1.20 0.00 0.00
> 
> so most of the time really goes into factor(). If one is careful about
> the innards of table() one can shave the time for that to 
> 
> > system.time(tab2(f))
> [1] 0.66 0.01 0.67 0.00 0.00
> 
> Rather interestingly, the non-constant-time part of table would seem
> equivalent to 
> 
> > system.time(as.integer(0)+as.integer(1)*(as.integer(f)-as.integer(1)))
> [1] 0.25 0.00 0.25 0.00 0.00
> > system.time(as.integer(0)+as.integer(1)*(as.integer(f)-as.integer(1)))
> [1] 0.07 0.00 0.07 0.00 0.00
> 
> Notice the huge difference between the two executions, indicating that
> the number of garbage collections involved probably plays a major role.
> 
> On the whole it doesn't really seem to be worth it to optimize this
> very heavily, but if you have any obvious improvements for factor()...

Almost all the time goes into match(), which on my system is 25x slower
in R than in S-PLUS 5.1 for this example.  I recall that we see this
problem in slow model.* manipulations too.

I don't know why match is slow: it does use hashing but may not be
optimized for matches into small sets.
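
For data like Bill's (small non-negative integers) the lookup can be
avoided altogether.  The sketch below does not explain why match() is
slow; it only sidesteps it by shifting the values by one so that
tabulate() also counts the zeros, which plain tabulate(m) drops.  The
seed and the names are assumptions made for the example.

```r
# Sketch: count small non-negative integers without factor() or match().
set.seed(1)                      # assumed seed, for reproducibility only
m <- rpois(100000, 1)

# tabulate() counts the values 1..nbins, so shift by one to include 0
counts <- tabulate(m + 1L, nbins = max(m) + 1L)
names(counts) <- 0:max(m)

# Reference result via table(), with explicit levels so that values
# absent from the sample still get a zero count
ref <- table(factor(m, levels = 0:max(m)))

stopifnot(all(counts == as.vector(ref)))
print(counts)
```

Unlike table(m), this reports a zero for any value between 0 and
max(m) that does not occur in the sample.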

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
