[Rd] improving the performance of install.packages

Pages, Herve hpages using fredhutch.org
Fri Nov 8 21:12:25 CET 2019


I guess you would just use force=TRUE.
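
To make that concrete, here is a rough sketch of how a "force" argument
could behave. The helper names pkgs_to_install() and install_if_needed()
are hypothetical and not part of utils; the check uses system.file(),
which (unlike requireNamespace()) does not load the package. It only
tests whether a package is present, not which version is installed.

```r
# Hypothetical helper (not part of utils): which of 'pkgs' would be installed?
pkgs_to_install <- function(pkgs, force = FALSE, lib.loc = NULL) {
  # force = TRUE keeps today's behaviour: everything is (re)installed
  if (force) return(pkgs)
  # system.file() returns "" when the package cannot be found
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p, lib.loc = lib.loc)),
    logical(1)
  )
  pkgs[!installed]
}

# Hypothetical wrapper: skip already-installed packages unless force = TRUE
install_if_needed <- function(pkgs, ..., force = FALSE) {
  todo <- pkgs_to_install(pkgs, force = force)
  if (length(todo) > 0L) utils::install.packages(todo, ...)
  invisible(todo)
}
```

With something like this, a developer reinstalling the same version over
and over would call install_if_needed("mypkg", force = TRUE), while
scripts that only need the package to be present would rely on the
default force = FALSE.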

H.

On 11/8/19 12:06, William Dunlap via R-devel wrote:
> While developing a package, I often run install.packages() on it many times
> in a session without updating its version number.  How would your proposed
> change affect this workflow?
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> 
> On Fri, Nov 8, 2019 at 11:56 AM Joshua Bradley <jgbradley1 using gmail.com> wrote:
> 
>> I could do this...and I have before. This brings up a more fundamental
>> question though. You're asking me to write code that changes the logic of
>> the installation process (i.e. writing my own package installer). Instead
>> of doing that, I would rather integrate that logic into R itself to improve
>> the baseline installation process. This proposed API change would be
>> additive and would not break legacy code.
>>
>> Package managers like pip (python), conda (python), yum (CentOS), apt
>> (Ubuntu), and apk (Alpine) are all "smart" enough to know (by their
>> defaults) when to not download a package again. By proposing this change,
>> I'm essentially asking that R follow some of the same conventions and best
>> practices that other package managers have adopted over the decades.
>>
>> I assumed this list is used to discuss proposals like this to the R
>> codebase. If I'm on the wrong list, please let me know.
>>
>> P.S. if this change happened, it would be interesting to study the effect
>> it has on the bandwidth across all CRAN mirrors. A significant drop would
>> turn into actual $$ saved
>>
>> Josh Bradley
>>
>>
>> On Fri, Nov 8, 2019 at 5:00 AM Duncan Murdoch <murdoch.duncan using gmail.com>
>> wrote:
>>
>>> On 08/11/2019 2:06 a.m., Joshua Bradley wrote:
>>>> Hello,
>>>>
>>>> Currently if you install a package twice:
>>>>
>>>> install.packages("testit")
>>>> install.packages("testit")
>>>>
>>>> R will build the package from source (depending on what OS you're using)
>>>> twice by default. This becomes especially burdensome when people are using
>>>> big packages (i.e. lots of depends) and someone has a script with:
>>>>
>>>> install.packages("tidyverse")
>>>> ...
>>>> ... later on down the script
>>>> ...
>>>> install.packages("dplyr")
>>>>
>>>> In this case, "dplyr" is part of the tidyverse and will install twice. As
>>>> the primary "package manager" for R, it should not install a package twice
>>>> (by default) when it can be so easily checked. Indeed, many people resort
>>>> to writing a few lines of code to filter out already-installed packages.
>>>> An r-help post from 2010 proposed a solution to improving the default
>>>> behavior, by adding "force=FALSE" as an API addition to install.packages
>>>> (https://stat.ethz.ch/pipermail/r-help/2010-May/239492.html).
>>>>
>>>> Would the R-core devs still consider this proposal?
>>>
>>> Whether or not they'd do it, it's easy for you to do it.
>>>
>>> install.packages <- function(pkgs, ..., force = FALSE) {
>>>     if (!force) {
>>>       pkgs <- Filter(Negate(requireNamespace), pkgs)
>>>     }
>>>
>>>     utils::install.packages(pkgs, ...)
>>> }
>>>
>>> You might want to make this more elaborate, e.g. doing update.packages()
>>> on the ones that exist.  But really, isn't the problem with the script
>>> you're using, which could have done a simple test before forcing a slow
>>> install?
>>>
>>> Duncan Murdoch
>>>
>>
>>          [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

