[Rd] improving the performance of install.packages

Duncan Murdoch murdoch.duncan using gmail.com
Sat Nov 9 00:07:39 CET 2019


On 08/11/2019 6:02 p.m., William Dunlap wrote:
> Suppose update.packages("pkg") installed "pkg" if it were not already 
> installed, in addition to its current behavior of installing "pkg" if 
> "pkg" is installed but a newer version is available.  The OP could then 
> use update.packages() all the time instead of install.packages() the 
> first time and update.packages() subsequent times.

That makes more sense to me than the "force = FALSE" proposal.
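
As a rough sketch of what that combined behaviour might look like in user code (this is not an existing R function: plan_packages() and install_or_update() are hypothetical names, and only installed.packages(), available.packages(), compareVersion() and install.packages() from utils are real). It assumes that a package which is present and already at the repository version should simply be skipped:

```r
# Hypothetical helper, not part of R. 'installed' and 'available' are named
# character vectors of version strings (names are package names). They are
# passed in so the decision logic can be checked without network access.
plan_packages <- function(pkgs, installed, available) {
  pkgs <- unique(pkgs)
  to_install <- setdiff(pkgs, names(installed))
  # Versions can only be compared for packages the repository offers.
  comparable <- intersect(intersect(pkgs, names(installed)), names(available))
  is_old <- vapply(
    comparable,
    function(p) utils::compareVersion(available[[p]], installed[[p]]) > 0,
    logical(1)
  )
  list(install = to_install, update = comparable[is_old])
}

# Hypothetical wrapper: install what is missing, reinstall what is stale,
# and do nothing for packages that are already current.
install_or_update <- function(pkgs, ...) {
  ip <- utils::installed.packages()
  ap <- utils::available.packages()
  plan <- plan_packages(
    pkgs,
    installed = stats::setNames(ip[, "Version"], rownames(ip)),
    available = stats::setNames(ap[, "Version"], rownames(ap))
  )
  todo <- c(plan$install, plan$update)
  # Guard the empty case explicitly rather than relying on how
  # install.packages() treats a zero-length 'pkgs'.
  if (length(todo) > 0) utils::install.packages(todo, ...)
  invisible(plan)
}
```

With something like this, calling install_or_update(c("tidyverse", "dplyr")) twice in a row should download nothing the second time, assuming the first call succeeded and the repository has not changed in between.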

Duncan Murdoch

> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com <http://tibco.com>
> 
> 
> On Fri, Nov 8, 2019 at 2:51 PM Duncan Murdoch <murdoch.duncan using gmail.com 
> <mailto:murdoch.duncan using gmail.com>> wrote:
> 
>     On 08/11/2019 2:55 p.m., Joshua Bradley wrote:
>      > I could do this...and I have before. This brings up a more fundamental
>      > question though. You're asking me to write code that changes the logic of
>      > the installation process (i.e. writing my own package installer). Instead
>      > of doing that, I would rather integrate that logic into R itself to improve
>      > the baseline installation process. This API proposal change would be
>      > additive and would not break legacy code.
> 
>     That's not true.  The current behaviour is equivalent to force=TRUE; I
>     believe the proposal was to change the default to force=FALSE.
> 
>     If you didn't change the default, it wouldn't help your example:  the
>     badly written script would run with force=TRUE, and wouldn't benefit
>     at all.
> 
>     Duncan Murdoch
> 
>      >
>      > Package managers like pip (Python), conda (Python), yum (CentOS), apt
>      > (Ubuntu), and apk (Alpine) are all "smart" enough to know (by their
>      > defaults) when to not download a package again. By proposing this change,
>      > I'm essentially asking that R follow some of the same conventions and best
>      > practices that other package managers have adopted over the decades.
>      >
>      > I assumed this list is used to discuss proposals like this to the R
>      > codebase. If I'm on the wrong list, please let me know.
>      >
>      > P.S. if this change happened, it would be interesting to study the effect
>      > it has on the bandwidth across all CRAN mirrors. A significant drop would
>      > turn into actual $$ saved
>      >
>      > Josh Bradley
>      >
>      >
>      > On Fri, Nov 8, 2019 at 5:00 AM Duncan Murdoch
>     <murdoch.duncan using gmail.com <mailto:murdoch.duncan using gmail.com>>
>      > wrote:
>      >
>      >> On 08/11/2019 2:06 a.m., Joshua Bradley wrote:
>      >>> Hello,
>      >>>
>      >>> Currently if you install a package twice:
>      >>>
>      >>> install.packages("testit")
>      >>> install.packages("testit")
>      >>>
>      >>> R will build the package from source (depending on what OS you're using)
>      >>> twice by default. This becomes especially burdensome when people are using
>      >>> big packages (i.e. lots of depends) and someone has a script with:
>      >>>
>      >>> install.packages("tidyverse")
>      >>> ...
>      >>> ... later on down the script
>      >>> ...
>      >>> install.packages("dplyr")
>      >>>
>      >>> In this case, "dplyr" is part of the tidyverse and will install twice. As
>      >>> the primary "package manager" for R, it should not install a package twice
>      >>> (by default) when it can be so easily checked. Indeed, many people resort
>      >>> to writing a few lines of code to filter out already-installed packages.
>      >>> An r-help post from 2010 proposed a solution to improving the default
>      >>> behavior, by adding "force=FALSE" as an API addition to install.packages
>      >>> (https://stat.ethz.ch/pipermail/r-help/2010-May/239492.html).
>      >>>
>      >>> Would the R-core devs still consider this proposal?
>      >>
>      >> Whether or not they'd do it, it's easy for you to do it.
>      >>
>      >> install.packages <- function(pkgs, ..., force = FALSE) {
>      >>     if (!force) {
>      >>       # Keep only packages whose namespace cannot already be loaded
>      >>       pkgs <- Filter(Negate(requireNamespace), pkgs)
>      >>     }
>      >>     utils::install.packages(pkgs, ...)
>      >> }
>      >>
>      >> You might want to make this more elaborate, e.g. doing update.packages()
>      >> on the ones that exist.  But really, isn't the problem with the script
>      >> you're using, which could have done a simple test before forcing a slow
>      >> install?
>      >>
>      >> Duncan Murdoch
>      >>
>      >
>      > ______________________________________________
>      > R-devel using r-project.org <mailto:R-devel using r-project.org> mailing list
>      > https://stat.ethz.ch/mailman/listinfo/r-devel
>      >
>


