[Rd] Proposal to limit Internet access during package load

Iñaki Ucar |uc@r @end|ng |rom |edor@project@org
Fri Sep 23 17:22:49 CEST 2022


Hi all,

I'd like to open this debate here, because IMO this is a big issue.
Many packages access the Internet during package load for various
reasons, some more legitimate than others, but I think that this
shouldn't be allowed, because it basically means that installation
fails on a machine without Internet access (which happens e.g. in
Linux distro builders for security reasons).

Now, what if connections were suppressed during package load? There
are basically three use cases out there:

(1) The package requires additional files for the installation (e.g.
the source code of an external library) that cannot be bundled into
the package due to CRAN restrictions (size).
(2) The package requires additional files for using it (e.g.,
datasets, a JAR...) that cannot be bundled into the package due to
CRAN restrictions (size).
(3) Other spurious reasons (e.g. the maintainer decided that package
load was a good place to check the availability of an online service,
etc.).

Again IMO, (3) shouldn't be allowed in any case; (2) should be a
separate function that the user actively calls to download the files,
and those files should be placed into the user dir; and (1) is the
only legitimate use, but then another mechanism should be provided to
avoid connections during package load.
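
To make (2) more concrete, here is a minimal sketch of what such a
user-called function could look like. The function name, the package
name "mypkg" and the arguments are hypothetical; tools::R_user_dir()
is the existing base R helper (R >= 4.0.0) for per-user directories.

```r
# Hypothetical helper for use case (2): nothing is downloaded at load
# time; the user calls this explicitly when the extra files are wanted.
fetch_extra_data <- function(url, pkg = "mypkg", overwrite = FALSE) {
  # Per-user data directory for the package (needs R >= 4.0.0)
  dest_dir <- tools::R_user_dir(pkg, which = "data")
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)

  dest <- file.path(dest_dir, basename(url))
  # Only touch the network if the file is missing or a refresh is requested
  if (overwrite || !file.exists(dest)) {
    status <- utils::download.file(url, dest, mode = "wb", quiet = TRUE)
    if (status != 0) stop("download failed: ", url)
  }
  invisible(dest)
}
```

The package would then only look into that directory at run time and
tell the user to call the function if the files are not there.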

My proposal to support (1) would be to add a new field in the
DESCRIPTION, "Additional_sources", which would be a comma-separated
list of additional resources to download during R CMD INSTALL. Those
sources would be downloaded by R CMD INSTALL if not provided via an
option (to support offline installations), and would be placed in a
predefined place for the package to find and configure them (via an
environment variable or in a predefined subdirectory).
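
As a rough illustration of the mechanism, the following sketch parses
the proposed field and resolves each entry either from a local
directory (the offline case) or by downloading it. Nothing here exists
in R today: the field, the function name and its arguments are
assumptions made only for the sake of the example.

```r
# Hypothetical sketch: neither "Additional_sources" nor this function
# is part of R; it only illustrates the proposed behaviour.
resolve_additional_sources <- function(desc_file, offline_dir = NULL,
                                       dest_dir = tempfile("additional_sources")) {
  fields <- read.dcf(desc_file, fields = "Additional_sources")
  value <- fields[1, "Additional_sources"]
  if (is.na(value)) return(character(0))  # field absent: nothing to do

  # Comma-separated list, possibly folded over several lines
  urls <- trimws(strsplit(value, ",", fixed = TRUE)[[1]])
  urls <- urls[nzchar(urls)]

  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  vapply(urls, function(u) {
    dest <- file.path(dest_dir, basename(u))
    local_copy <- if (is.null(offline_dir)) NA_character_
                  else file.path(offline_dir, basename(u))
    if (!is.na(local_copy) && file.exists(local_copy)) {
      # Offline installation: use the copy that was provided up front
      file.copy(local_copy, dest, overwrite = TRUE)
    } else {
      utils::download.file(u, dest, mode = "wb", quiet = TRUE)
    }
    dest
  }, character(1), USE.NAMES = FALSE)
}
```

The resulting directory could then be exposed to the configure script
through an environment variable, so that the package itself never
needs to open a connection.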

This proposal has several advantages. Apart from the obvious one
(Internet access during package load can be limited without losing
current functionality), it gives more visibility to the resources
that packages use during the installation phase, and thus makes
those installations more reproducible and more secure.

Best,
-- 
Iñaki Úcar


