set functions

Prof Brian D Ripley ripley@stats.ox.ac.uk
Wed, 5 Jan 2000 10:21:24 +0000 (GMT)


On Wed, 5 Jan 2000, Martin Maechler wrote:

>  On 4 Jan 2000, Peter Dalgaard BSA wrote:
>  > Watch:
>  > 
>  > > x<-1:50000
>  > > y<-x[order(runif(50000))]
>  > > "equiv2" <- function(x, y) all(c(match(x, y, 0)>0, match(y, x, 0)>0))
>  > > equiv<-function(x,y) 
>  > +     length(x<-unique(x))==length(y<-unique(y)) && 
>  > +     all(sort(x)==sort(y)) 
>  > > system.time(equiv2(x,y))
>  > [1] 3.10 0.02 3.00 0.00 0.00
>  > > system.time(equiv(x,y))
>  > [1] 0.77 0.00 1.00 0.00 0.00
> 
>     JonR> Yup -- that's much quicker!  To re-ask the original question,
>     JonR> would it be reasonable to include such a function along with the
>     JonR> other set functions?  Cheers, Jonathan.
> 
> quite a good idea, particularly, since we all have now learned that it is
> non-trivial to write really efficiently.

Some of us knew that. What worries me a bit is that optimizing code for the
current R may not be a good idea. R currently spends a lot of its
time on garbage collection (30 to 50% in my profiling), and it is planned to
alter the memory allocator real soon now.  When hashing of environments
was introduced it made a lot of difference to some code, and little to
others.  That's not to say that we should not optimize, but
trying hard may be a waste of time. (Says he having learnt the hard way
across S-PLUS versions.)
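
Since the ranking may shift between versions, it is probably more useful to
re-time the candidates than to trust any one set of figures. A minimal sketch
follows, assuming a reasonably recent R. The names equiv_match and equiv_sort
are illustrative renamings of the two functions quoted above, not part of R.
sample() and setequal() are assumed to be available in base R, as they are in
current releases.

```r
# Two set-equality tests from the quoted message, lightly reformatted.
# The function names are illustrative, not part of base R.
equiv_match <- function(x, y) {
  # every element of x occurs in y, and every element of y occurs in x
  all(c(match(x, y, 0) > 0, match(y, x, 0) > 0))
}

equiv_sort <- function(x, y) {
  # drop duplicates, then compare the sorted unique values
  x <- unique(x)
  y <- unique(y)
  length(x) == length(y) && all(sort(x) == sort(y))
}

x <- 1:50000
y <- sample(x)  # a random permutation of x

# Check that the candidates agree before comparing their speed.
stopifnot(equiv_match(x, y), equiv_sort(x, y), setequal(x, y))
stopifnot(!equiv_match(x, c(y, 50001L)), !equiv_sort(x, c(y, 50001L)))

# Timings depend on the R version, the allocator, and the machine.
print(system.time(equiv_match(x, y)))
print(system.time(equiv_sort(x, y)))
```

Rerunning this after a change to the allocator or to match() shows whether
the earlier tuning still pays off.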

Brian

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
