[Rd] improving the performance of install.packages

Pages, Herve hpages using fredhutch.org
Sat Nov 9 00:05:45 CET 2019


Hi Gabe,

Keeping track of where a package was installed from would be a nice
feature. However, it wouldn't be as reliable as comparing hashes to
decide whether a package needs re-installation or not.
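
To make the hash idea concrete, here is a minimal sketch, not a description of how install.packages behaves today. It assumes the MD5 of the tarball a package was installed from had been recorded at install time, which R does not currently do. The names needs_reinstall() and recorded_md5 are hypothetical; tools::md5sum() is the only real API used.

```r
# Hypothetical helper: decide whether a candidate tarball differs from the one
# that was installed. `recorded_md5` stands in for a hash that would have to
# be stored at install time; R does not record it today.
needs_reinstall <- function(candidate_tarball, recorded_md5) {
  stopifnot(file.exists(candidate_tarball))
  candidate_md5 <- unname(tools::md5sum(candidate_tarball))
  # Without a recorded hash we cannot show the installed copy is identical,
  # so fall back to reinstalling.
  if (is.na(recorded_md5) || !nzchar(recorded_md5)) return(TRUE)
  !identical(candidate_md5, recorded_md5)
}

# Stand-in files so the sketch runs without network access.
old <- tempfile(fileext = ".tar.gz"); writeLines("version A", old)
new <- tempfile(fileext = ".tar.gz"); writeLines("version B", new)
recorded <- unname(tools::md5sum(old))

print(needs_reinstall(old, recorded))  # candidate has the same bytes
print(needs_reinstall(new, recorded))  # candidate has different bytes
```

For the repository side, available.packages() reports an MD5sum column when the repository's PACKAGES index provides one, so a real implementation could compare against that value instead of downloading the tarball first.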

H.

On 11/8/19 12:37, Gabriel Becker wrote:
> Hi Josh,
> 
> There are a few issues I can think of with this. The primary one is that
> CRAN(/Bioconductor) is not the only place one can install packages from. I
> might have version x.y.z of a package installed that was, at the time, a
> development version I got from GitHub, or installed locally, etc. Hell, I
> might have a later devel version but want the CRAN version. Not common,
> sure, but it will likely happen often enough that install.packages not
> reinstalling when I tell it to is probably bad.
> 
> Currently (though there has been some discussion of changing this) packages
> do not remember where they were installed from, so R wouldn't know whether
> the version you have is actually the same one as on the repository you
> pointed install.packages to. If that were changed and we knew that
> we were getting the byte-identical package from the actual same source, I
> think this would be a nice addition. Without it, I think skipping the
> install would be right a high, but not high enough, proportion of the time.
> 
>> R will build the package from source (depending on what OS you're using)
>> twice by default. This becomes especially burdensome when people are using
>> big packages (i.e. lots of depends) and someone has a script with:
>>
>> install.packages("tidyverse")
>> ...
>> ... later on down the script
>> ...
>> install.packages("dplyr")
>>
>
> I mean, IMHO and as I think Duncan was alluding to, that's straight up an
> error by the script author. I think it's a few of them, actually, but it's at
> least one. An understandable one, sure, but that's still what it is. Scripts
> (which are meant to be run more than once, generally) usually shouldn't
> really be calling install.packages in the first place, but if they do, they
> should certainly not be installing umbrella packages and the packages they
> bring with them separately.
> 
> Even having one vectorized call to install.packages where all the packages
> are installed would prevent this issue, including in the case where the
> user doesn't understand the purpose of the tidyverse package. Though the
> installation would still occur every time the script was run.
> 
> 
> The last thing to note is that there are at least 2 packages which provide
> a function which does this already (install.load and remotes), so people
> can get this functionality if they need it.
> 
> 
> On Fri, Nov 8, 2019 at 11:56 AM Joshua Bradley <jgbradley1 using gmail.com> wrote:
> 
>> I assumed this list is used to discuss proposals like this to the R
>> codebase. If I'm on the wrong list, please let me know.
>>
> 
> This is the right place to discuss things like this. Thanks for starting
> the conversation.
> 
> Best,
> ~G
> 
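
On the point quoted above about a single vectorized call: the following is a minimal sketch of a script preamble that installs only what is missing. install_missing() and its dry_run argument are hypothetical, not part of base R and not the API of the install.load or remotes packages. It checks only whether a package is present in the library, not its version or where it came from, so it does not address the provenance concern.

```r
# Hypothetical helper: install, in one vectorized call, only the packages that
# are not already present in the library. Presence is the only check; version
# and origin are ignored.
install_missing <- function(pkgs, lib.loc = NULL, dry_run = FALSE, ...) {
  pkgs <- unique(pkgs)
  installed <- rownames(utils::installed.packages(lib.loc = lib.loc))
  missing_pkgs <- setdiff(pkgs, installed)
  if (length(missing_pkgs) > 0 && !dry_run) {
    # One call for everything that is missing, so shared dependencies are
    # resolved and built once rather than once per call.
    utils::install.packages(missing_pkgs, ...)
  }
  invisible(missing_pkgs)
}

# dry_run = TRUE only reports what would be installed.
print(install_missing(c("stats", "utils"), dry_run = TRUE))
```

Because the helper returns the names it would install, a script can log them or stop and ask the user instead of installing silently.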

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

