[Rd] no visible binding for global variable for data sets in a package

Martin Maechler maechler at stat.math.ethz.ch
Thu Aug 28 10:50:44 CEST 2014


>>>>> peter dalgaard <pdalgd at gmail.com>
>>>>>     on Wed, 27 Aug 2014 21:09:47 +0200 writes:

    > On 27 Aug 2014, at 16:48 , Hadley Wickham <h.wickham at gmail.com> wrote:

    >>> I think the right answer _is_ to export the lazy data; the question is how to do it. There's nothing particularly strange about exporting non-functions ("letters" would be an example, save for the special status of package:base). If you attach the package, the lazyloaded data appear in the same environment as the exported function so they are de facto already in the namespace for the purposes of library() and `::`. So I agree, something like exportData() would be useful. (Or some other mechanism. You might want to be able to export data selectively.)
    >> 
    >> I don't think lazyloaded data are in the same environment as exported
    >> functions - getExportedValue() (called by ::) looks first in the
    >> "exports" namespace, then in the "lazydata" namespace:
    >> 
    >> function (ns, name)
    >> {
    >>     getInternalExportName <- function(name, ns) {
    >>         exports <- getNamespaceInfo(ns, "exports")
    >>         if (exists(name, envir = exports, inherits = FALSE))
    >>             get(get(name, envir = exports, inherits = FALSE),
    >>                 envir = ns)
    >>         else {
    >>             ld <- getNamespaceInfo(ns, "lazydata")
    >>             if (exists(name, envir = ld, inherits = FALSE))
    >>                 get(name, envir = ld, inherits = FALSE)
    >>             else stop(gettextf("'%s' is not an exported object from 'namespace:%s'",
    >>                                name, getNamespaceName(ns)), call. = FALSE, domain = NA)
    >>         }
    >>     }
    >>     ns <- asNamespace(ns)
    >>     if (isBaseNamespace(ns))
    >>         get(name, envir = ns, inherits = FALSE)
    >>     else getInternalExportName(name, ns)
    >> }
    >> 
    >> 
    >> (But maybe you just meant that library() and :: behave as if lazydata
    >> and exports were the same thing)
    >> 
    >> Hadley
    >> 
    >> -- 
    >> http://had.co.nz/

    > I meant that 

    > a) :: gives results as if the data was in the namespace

    > b) if you do

    >> library(MASS)
    >> ls("package:MASS")
    > [1] "abbey"              "accdeaths"          "addterm"           
    > [4] "Aids2"              "Animals"            "anorexia"          
    > [7] "area"               "as.fractions"       "bacteria"          
    > [10] "bandwidth.nrd"      "bcv"                "beav1"             
    > ....

    > data and functions get put together in the same environment (however, this is not the namespace environment, see later).

    > What puzzles me is why the distinction between lazydata and exports was there to begin with. 

    > The implication of the current setup is clearly that pkg::foo() cannot access pkg::dat by referring to `dat` whereas it can do so if invoked as library(pkg); foo(). 

    > We also have

    >> get("accdeaths", environment(MASS::addterm))
    > Error in get("accdeaths", environment(MASS::addterm)) : 
    > object 'accdeaths' not found
    >> library(MASS)
    >> get("accdeaths", environment(MASS::addterm))
    > Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
    > 1973  9007  8106  8928  9137 10017 10826 11317 10744  9713  9938  9161
    > 1974  7750  6981  8038  8422  8714  9512 10120  9823  8743  9129  8710
    > ...

    > which confused me at first, but it actually just means that "accdeaths" is found on the search path in the latter case. This strikes me as somewhat dangerous: If a package uses one of its own datasets, it can be masked by a later attach() or the global env. 

    > (I suspect that someone already explained all this a while back, but I just wasn't listening at the time...)
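
The masking risk described above can be reproduced without building a
package. The following is only a sketch: it assumes the recommended
package MASS is installed, and the function f() is a hypothetical
stand-in for a package function that refers to one of its own datasets
by name. Its environment is set to the MASS namespace to mimic code
defined inside the package.

```r
# Sketch only: assumes MASS is installed; f() is a hypothetical stand-in
# for a function defined inside the package.
library(MASS)

f <- function() accdeaths
environment(f) <- asNamespace("MASS")   # mimic a function living in MASS

# Not in the namespace, so lookup falls through to the search path and
# finds the dataset in "package:MASS".
before <- f()

# A global variable of the same name now masks the dataset for f().
assign("accdeaths", "masked", envir = globalenv())
after <- f()

# Clean up; the dataset becomes visible again.
rm("accdeaths", envir = globalenv())
restored <- f()

print(class(before))
print(after)
```

The lookup goes from the namespace through its imports and the base
namespace to the global environment before it reaches the attached
package, which is why the global variable wins.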

I did say (in the first reply in this thread) that the lazyloaded
datasets were in the package environment, but not
in the package's *namespace* environment -- which is somewhat
exceptional, as otherwise the namespace envir is typically a superset
of the package envir. I also mentioned the solution with
sysdata.rda + NAMESPACE which Brian and Hadley later mentioned, too.
However, Michael did not find that sufficient (because you
need to read other material to apply it), and I do think it is not an
ideal solution for data sets that have help pages, etc.; notably
because creating sysdata.rda is not at all part of the package
sources, which makes maintenance and "open source transparency" harder.
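
To make the package vs. namespace distinction concrete, here is a small
check. It is a sketch that assumes the recommended package MASS is
installed and uses its dataset "accdeaths" and its function "addterm"
from earlier in the thread; details may differ between R versions.

```r
# Sketch only: assumes MASS is installed.
ns <- asNamespace("MASS")               # loads the namespace, no attach

# The function is in the namespace environment, the dataset is not.
in_ns_fun  <- exists("addterm",   envir = ns, inherits = FALSE)
in_ns_data <- exists("accdeaths", envir = ns, inherits = FALSE)

# The dataset sits in the separate "lazydata" environment instead.
ld <- getNamespaceInfo(ns, "lazydata")
in_lazydata <- exists("accdeaths", envir = ld, inherits = FALSE)

# After attaching, both are visible in the package environment.
library(MASS)
pkg_env <- as.environment("package:MASS")
in_pkg_fun  <- exists("addterm",   envir = pkg_env, inherits = FALSE)
in_pkg_data <- exists("accdeaths", envir = pkg_env, inherits = FALSE)

print(c(in_ns_fun = in_ns_fun, in_ns_data = in_ns_data,
        in_lazydata = in_lazydata,
        in_pkg_fun = in_pkg_fun, in_pkg_data = in_pkg_data))
```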

The one additional important point in the thread was the special
semantics of '::', which here returns something for
 <pkg>::<obj> which is *not* part of the <pkg>'s namespace,
whereas the use of '::' does suggest to Joe Average that he is
working with namespace (as opposed to package) contents.
I agree with your suggestion, Peter, that this looks more
confusing than it should, and ideally, we'd find a better setup.
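
The special treatment by '::' can be seen directly. Again a sketch
assuming MASS is installed; the tryCatch() wrapper and the variable
names are only there for illustration.

```r
# Sketch only: assumes MASS is installed.
via_colons <- MASS::accdeaths                         # '::' finds the dataset
via_fun    <- getExportedValue("MASS", "accdeaths")   # what '::' calls
same <- identical(via_colons, via_fun)

# ... although the dataset is not among the namespace exports:
listed <- "accdeaths" %in% getNamespaceExports("MASS")

# ':::' looks only in the namespace environment itself, so it is
# expected to fail for lazy-loaded data.
triple <- tryCatch(MASS:::accdeaths,
                   error = function(e) "not found in namespace")

print(same)
print(listed)
print(triple)
```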

OTOH, if we additionally allowed something like   exportData(),
we would get data that is in both the package and
namespace environments and other (not exported) data that is
only in the package, which would add room for even more confusion.

Martin


