[Rd] loadNamespace and versionChecking and the otherpackage::otherfun syntax

Geoff Lee geoff.lee99 at gmail.com
Sun Dec 21 01:41:35 CET 2014


This is an enquiry not so much about what the code for loadNamespace does, but
rather about the intent and design of loadNamespace, and how it interacts with
the `::` function, which seems to me to follow a slightly different philosophy.

It is not an urgent question - the issue that started me wondering has been
resolved another way - but I would like to complete my understanding of this
aspect of how R's packaging mechanism is meant to operate.

It's also rather a long query - if it's too long please don't waste your time
- just ignore it.  To try and make it slightly more digestible it is divided
into sections, as follows:
    SCENARIO
    THE QUERY
    MY OWN ATTEMPTS TO UNDERSTAND
    AN ASIDE ABOUT loadNamespace AND Depends:
    VERSION CHECKS ON otherpackage::otherfun AT LOAD TIME?
    VERSION CHECKS ON otherpackage::otherfun AT EXECUTION TIME?
      POSSIBLE ANSWER 1 - THIS IS TOO COMPLEX AND HYPOTHETICAL
      POSSIBLE ANSWER 2 - THIS IS AN ISSUE FOR THE `::` FUNCTION
      POSSIBLE ANSWER 3 - loadNamespace SHOULD BE VERSION AWARE EVEN WHEN `::` IS USED

Many thanks in advance for any insights that are able to be offered.

Geoff

SCENARIO
========

The scenario is that `mypackage` uses an `otherpackage` via the
`otherpackage::otherfun` syntax, and that the version of `otherpackage` must
be (say) (>= 2.0).  The constraint on the version of `otherpackage` is
specified in the DESCRIPTION file of mypackage using either Imports or Depends.

THE QUERY
=========

The query is:
a) Is it intended that loadNamespace should check the version of otherpackage
(if one is specified) when it is `loadNamespace`ing mypackage?
and if so
b) Is it intended that the process of loading mypackage should ensure that the
correct version of otherfun cannot be accidentally masked (for example by using
.libPaths() to change the library search path between the time when mypackage
is loaded, and the time when the component of mypackage that calls
otherpackage::otherfun is executed)?

MY OWN ATTEMPTS TO UNDERSTAND
=============================

What I've done so far

I've read the documentation I could find, stepped through loadNamespace using
debugonce (several times) with toy packages to gain some insight into what
loadNamespace does at the moment (using R 3.1.2 on a Windows 64 bit machine),
and read the loadNamespace code a few times (though I can't yet claim to follow
all of its neat tricks and complexities).

My understanding thus far is that
a) loadNamespace learns about version constraints on `otherpackage` dependencies
when it is processing the DESCRIPTION file, viz
        vI <- pkgInfo$Imports
and
b) defers the checking of these dependencies till later, when it is processing
the NAMESPACE file (to create the imports:mypackage namespace/environment/frame
which encloses the namespace:mypackage namespace/environment/frame).  The
checking occurs inside 3 loops, which all use an appropriate entry from vI as
the versionCheck argument in a recursive call to loadNamespace, viz.
        for (i in nsInfo$imports) {
         ...etc...
        }
        for (imp in nsInfo$importClasses) ...etc...
        for (imp in nsInfo$importMethods) ...etc...
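
The versionCheck argument can also be exercised directly, outside those loops.
A minimal sketch, assuming versionCheck takes the documented form of a list
with components op and version, and using the base package stats as a stand-in
for the hypothetical otherpackage so that the example runs:

```r
# 'stats' stands in for the hypothetical 'otherpackage'.
# versionCheck is a list with components 'op' and 'version'.
ns <- loadNamespace("stats", versionCheck = list(op = ">=", version = "2.0"))
ok_version <- getNamespaceVersion(ns)

# A constraint that cannot be met should make loadNamespace signal an error;
# the error message is captured here instead of stopping the script.
too_new <- tryCatch(
    loadNamespace("stats", versionCheck = list(op = ">=", version = "999.0")),
    error = function(e) conditionMessage(e)
)
```

If the second call fails as expected, too_new holds the error message rather
than a namespace environment.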

AN ASIDE ABOUT loadNamespace AND Depends:
=====================================

As an aside it appears that any version dependencies specified in the Depends
field are overlooked - I suspect that loadNamespace might be more complete if
there were something like
        vD <- pkgInfo$Depends
        vID <- c(vD, vI) #possibly with an unlist thrown in somewhere?
and the versionCheck used vID instead of just vI

VERSION CHECKS ON otherpackage::otherfun AT LOAD TIME?
===============================================

The problem with this elegant recursive approach to checking the version of
depended upon packages is that (as far as I know thus far) using the implicit
loading syntax otherpackage::otherfun does not involve any entry in the
mypackage NAMESPACE file, and hence loadNamespace never checks the version
specification for otherpackage.

Hence the first part of my query - is it intended that loadNamespace perform
such a check?

My initial thought was that the answer should be yes, loadNamespace should do
such a check, and so I wrote a little function which checked all the versions
specified in the DESCRIPTION file, at the time the file was initially
encountered.  (After a bit of debugging) it seemed to do what I wanted, in that
if the right version of otherpackage was not available, my amended
loadNamespace threw an error.
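
A check of this kind might look something like the sketch below.  It is not the
actual amendment to loadNamespace, only an illustration of the idea:
parse_dep_field and check_dep_versions are hypothetical names, the parsing of
the DESCRIPTION field is deliberately simplistic, and stats stands in for
otherpackage so that the example runs.

```r
# Hypothetical helper: split a DESCRIPTION-style dependency field such as
# "otherpackage (>= 2.0), stats" into name / op / version triples.
parse_dep_field <- function(field) {
    entries <- trimws(strsplit(field, ",", fixed = TRUE)[[1]])
    entries <- entries[nzchar(entries)]
    lapply(entries, function(entry) {
        name <- trimws(sub("\\(.*$", "", entry))
        if (!grepl("(", entry, fixed = TRUE)) {
            # No version constraint given for this package.
            return(list(name = name, op = NULL, version = NULL))
        }
        inside <- trimws(sub("^[^(]*\\(([^)]*)\\).*$", "\\1", entry))
        op <- sub("^([<>=!]+).*$", "\\1", inside)
        version <- trimws(sub("^[<>=!]+", "", inside))
        list(name = name, op = op, version = version)
    })
}

# Hypothetical helper: stop if any constrained package found via the current
# .libPaths() does not satisfy its constraint.
check_dep_versions <- function(field) {
    for (dep in parse_dep_field(field)) {
        if (dep$name == "R" || is.null(dep$op)) next
        installed <- utils::packageVersion(dep$name)
        if (!do.call(dep$op, list(installed, dep$version))) {
            stop("package '", dep$name, "' ", as.character(installed),
                 " was found, but ", dep$op, " ", dep$version, " is required")
        }
    }
    invisible(TRUE)
}

check_dep_versions("stats (>= 2.0), utils")
```

Note that this only looks at what is installed at the moment of the check; it
says nothing about what `::` will find later.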

But then I started to think about how I could test it thoroughly - by which I
mean not does my code do what I think it should do, but does it achieve the
outcome that motivated that coding in the first place.

VERSION CHECKS ON otherpackage::otherfun AT EXECUTION TIME?
====================================================

That led to the second part of my query - should / how could loadNamespace
ensure that I actually get the otherfun from the version of otherpackage
that has been specified in the mypackage DESCRIPTION file, when the 
otherpackage::otherfun code is actually executed?

My understanding is that the underlying intent of the namespace mechanism in
R packages is to ensure that when mypackage calls otherpackage::otherfun, it is
indeed otherpackage::otherfun I get, ie I do not get a different function, also
called otherfun, that for one reason or another exists in memory and is found
as R works its way up the chain of enclosing environments.  Usually the concern
is about an identically named but different `otherfun` from `yetanother`
package that has been loaded, or that the user has defined themselves in their
global environment.  But in the motivating example for my case, I wanted to
ensure I got the otherfun from version >= 2.0 of `otherpackage`, and in
particular, I did **NOT** get `otherpackage(version 1.0)::otherfun`.

I confess I haven't actually tried it, but I think that even with my up front
checking of the package dependencies mentioned in the DESCRIPTION file
I could probably get the 'wrong' outcome if I changed my .libPaths() between
loading mypackage and executing it.

This isn't *quite* as unlikely as it might seem - in my real world example I had
the official CRAN versions of mypackage (actually someone else's package!)
and otherpackage installed in my main library, and was using a development
library to explore changes to updated versions of mypackage and otherpackage -
I would load mypackage while I had the development library in .libPaths(), and
then, without thinking of all the implications, remove the development
library from .libPaths() while doing some exploratory testing using
(the loaded development version of) mypackage.

Anyway, this led to the 2nd part of my query - could / should loadNamespace
ensure that at execution time, otherpackage::otherfun actually respects the
version constraint specified in mypackage's DESCRIPTION file?
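
One way to see whether a session has drifted in this way is to compare the
namespace that is already loaded (if any) with what the current .libPaths()
would find.  A rough sketch only: loaded_vs_library is a hypothetical helper,
and stats stands in for otherpackage so that the example runs.

```r
# Hypothetical diagnostic; 'stats' stands in for 'otherpackage'.
loaded_vs_library <- function(pkg) {
    loaded <- isNamespaceLoaded(pkg)
    list(
        loaded = loaded,
        # Where the already-loaded namespace came from, and its version.
        loaded_path = if (loaded) getNamespaceInfo(pkg, "path") else NA_character_,
        loaded_version = if (loaded) unname(getNamespaceVersion(pkg)) else NA_character_,
        # The version the current library search path would supply.
        library_version = as.character(
            utils::packageVersion(pkg, lib.loc = .libPaths())
        )
    )
}

# Make sure the stand-in namespace is loaded before inspecting it.
requireNamespace("stats", quietly = TRUE)
info <- loaded_vs_library("stats")
```

If loaded_version and library_version differ, the session is in the kind of
mixed state described above.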

POSSIBLE ANSWER 1 - THIS IS TOO COMPLEX AND HYPOTHETICAL
=================================================

One thought was - this is all getting too complex and hypothetical, there is
only so much automatic protection that R / loadNamespace can offer, in which
case the answer to query part b) is no.

POSSIBLE ANSWER 2 - THIS IS AN ISSUE FOR THE `::` FUNCTION
================================================

Another thought was, this isn't loadNamespace's problem (it is doing what its
name advertises - viz loading namespaces), rather it is something that the `::`
function should look after.

Looking at the code for `::` it does not seem to have provision for specifying
a version constraint for the pkg argument.  If this is the "correct" approach,
the answer to part b) is again no, loadNamespace is behaving as designed - but
the `::` function should be upgraded so it knows about package versioning.
Under this "solution" the mypackage author would specify the otherpackage
version in the same segment of code that calls otherfun.
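
To make the idea concrete, a version aware lookup could be sketched as an
ordinary function rather than a change to `::` itself.  This is only an
illustration under assumptions: get_versioned is a hypothetical name (nothing
like it exists in base R as far as I know), it relies on the versionCheck
argument of loadNamespace, and stats::median stands in for
otherpackage::otherfun so that the example runs.

```r
# Hypothetical helper, not part of R: fetch an exported object from 'pkg',
# insisting that the namespace satisfies the given version constraint.
get_versioned <- function(pkg, name, op = ">=", version) {
    ns <- loadNamespace(pkg, versionCheck = list(op = op, version = version))
    getExportedValue(ns, name)
}

# 'stats' and 'median' stand in for 'otherpackage' and 'otherfun'.
med <- get_versioned("stats", "median", version = "2.0")
result <- med(c(1, 3, 5))
```

The cost is that the constraint is then repeated at every call site, instead of
being stated once in the DESCRIPTION file.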

POSSIBLE ANSWER 3 - loadNamespace SHOULD BE VERSION AWARE EVEN WHEN `::` IS USED
================================================================================

My most complicated possible answer was yes - loadNamespace should be
"aware" of calls which use the otherpackage::otherfun syntax, and "enforce" any
versioning given in the mypackage DESCRIPTION file, both at load time, by
checking the version of otherpackage which is available then, and at execution
time (by "somehow" storing a reference to the code for the correct package and
version of otherpackage::otherfun in the imports:mypackage namespace).

The only "somehow"s I could dream up were messy and hackish - eg
loadNamespace parses the mypackage code to find `otherpackage::otherfun` calls,
loads the otherpackage::otherfun code, and inserts a special mangled reference
name into the imports:mypackage namespace environment, AND
`::` is changed to look for a mangled name version of package before it runs
as it currently does.

And that feels so inelegant that I decided I must be approaching this wrongly,
or not understanding it properly, so I decided to stop exploring this myself
(very instructional though that has been) and pose this query instead.


