[Rd] speedbump in library

Mon Jan 26 14:12:55 CET 2015

An isNamespaceLoaded() function would be a useful thing to have in
general if we are interested in readable code. An efficient
implementation would be just a bonus.
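
For concreteness, here is a minimal sketch of what such a function might
look like at the R level, built on the !is.null(.getNamespace(pkg)) idiom
quoted below. The name is_namespace_loaded and the length-1 check are
assumptions made for illustration only; an efficient version would
presumably be done in C.

```r
# Hypothetical helper (the name is illustrative, not an existing base function):
# returns TRUE if the namespace 'pkg' is already loaded, FALSE otherwise.
# Unlike getNamespace(), .getNamespace() never triggers loading.
is_namespace_loaded <- function(pkg) {
    # .getNamespace() looks up a single name, so insist on one string
    stopifnot(is.character(pkg), length(pkg) == 1L)
    !is.null(.getNamespace(pkg))
}

is_namespace_loaded("stats")
is_namespace_loaded("noSuchPackageHopefully")
```

Unlike pkg %in% loadedNamespaces(), this is not vectorized over pkg,
which is why the sketch insists on a single string.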

On Mon, Jan 26, 2015 at 3:36 AM, Martin Maechler
<maechler at lynne.stat.math.ethz.ch> wrote:
>>>>>> Winston Chang <winstonchang1 at gmail.com>
>>>>>>     on Fri, 23 Jan 2015 10:15:53 -0600 writes:
>
>     > I think you can simplify a little by replacing this:
>
>     >   pkg %in% loadedNamespaces()
>     > with this:
>     >   .getNamespace(pkg)
>
> almost:  It would be
>
>       !is.null(.getNamespace(pkg))
>
>     > Whereas getNamespace(pkg) will load the package if it's not already
>     > loaded, calling .getNamespace(pkg) (note the leading dot) won't load
>     > the package.
>
> indeed.
> And you, Winston, are right that this new code snippet would be
> an order of magnitude faster :
>
> ##-----------------------------------------------------------------------------
>
> f1 <- function(pkg) pkg %in% loadedNamespaces()
> f2 <- function(pkg) !is.null(.getNamespace(pkg))
>
> require(microbenchmark)
>
> pkg <- "foo"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 38.516 40.9790 42.35037 41.7245 42.4060 82.922   100   b
> ##  r2 <- f2(pkg)  1.331  1.8285  2.13874  2.0855  2.3365  7.252   100  a
> ## [1] FALSE
>
> pkg <- "stats"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 29.955 31.2575 32.27748 31.6035 32.1215 62.428   100   b
> ##  r2 <- f2(pkg)  1.067  1.4315  1.71437  1.6335  1.8460  9.169   100  a
> ## [1] TRUE
> loadNamespace("Matrix")
> ## <environment: namespace:Matrix>
> pkg <- "Matrix"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 32.721 33.5205 35.17450 33.9505 34.6050 65.373   100   b
> ##  r2 <- f2(pkg)  1.010  1.3750  1.93671  1.5615  1.7795 12.128   100  a
> ## [1] TRUE
>
> ##-----------------------------------------------------------------------------
>
> Hence, indeed,
>                 !is.null(.getNamespace(pkg))
>
> seems equivalent to
>                  pkg %in% loadedNamespaces()
>
> --- when 'pkg' is of length 1  (!!!)
>
> but is 20 times faster....  and we have
> 11  occurrences  of   ' <...> %in%  loadedNamespaces() '
> in the "base packages" in the R (devel) sources,
>  3 in base,  2 in methods,  3 in stats, 2 in tools, 1 in utils..
>
> On the other hand,
>                  pkg %in% loadedNamespaces()
>
> is extremely nicely readable code, whereas
>                 !is.null(.getNamespace(pkg))
> is pretty much the contrary.
> ... and readable code is so much easier to maintain that, in
> many cases, code optimization at the cost of code obfuscation
> is *not* desirable.
>
> Of course we could yet again use a few lines of C and R code to
> provide a new R lowlevel function, say
>
>               is.loadedNamespace()
>
> which would be even faster than   !is.null(.getNamespace(pkg))
>
> ...
> ...
>
> but do we have *any* evidence that this would noticeably speed up
> any higher level function such as library() ?
>
>
> Thank you, again, Winston; you've opened an interesting topic!
>
> --
> Martin Maechler, ETH Zurich