[Rd] Duplicated mirrors on available packages

Kurt Hornik Kurt.Hornik using wu.ac.at
Tue Sep 13 14:02:04 CEST 2022


>>>>> Colin Gillespie writes:

> The use case came from the rig application
> (https://github.com/r-lib/rig). Rig (I think) inserts the RStudio
> package manager into the list of repos. This can cause duplication in
> repos, hence the current issue.
> Now that I know the reason, I can work around it.

Thanks for reporting this issue.  I just changed available.packages() as
follows:

-    for(repos in contriburl) {
+    for(repos in unique(contriburl)) {

which avoids the full duplication.
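
A minimal offline sketch of the effect of that change, assuming the duplicated repositories are given as identical URL strings. The repository names `CRAN` and `PPM` are hypothetical labels, and nothing here is downloaded; it only shows that `unique()` collapses the vector the loop iterates over.

```r
# Two repository entries that point at the same mirror
# (names "CRAN" and "PPM" are made up for illustration).
repos <- c(CRAN = "https://cloud.r-project.org/",
           PPM  = "https://cloud.r-project.org/")

# contrib.url() maps each repository to its contrib directory;
# type = "source" keeps the result platform-independent.
contriburl <- utils::contrib.url(repos, type = "source")

length(contriburl)          # two entries, both the same URL
length(unique(contriburl))  # one entry left to loop over

# Before the change, each duplicate URL would be processed again:
for (u in unique(contriburl)) message("would fetch index from: ", u)
```

On an R version without this change, passing `repos = unique(repos)` from the calling code should have a similar effect, as suggested further down the thread.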

Best
-k

> On Mon, 12 Sept 2022 at 09:57, Maxim Nazarov
> <maxim.nazarov using openanalytics.eu> wrote:
>> 
>> If you profile the second run, you will see that the majority of the time is spent in the `tools:::.remove_stale_dups` function, which loops over all duplicated packages - so all packages in that case.
>> One improvement I could think of is to replace the first line of that function
>> pkgs <- ap[, "Package"]
>> with
>> pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
>> which would help in your example, but the function might still run longer if there are many packages with different versions present, so there may be even better optimizations.
>> 
>> Stating the obvious here, but since we don't know your 'real' use case, adding a `unique` call to the `repos` argument of `available.packages` would achieve a similar improvement without any modifications needed from `tools`.
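
A small sketch of what the suggested replacement line selects, using a hand-made stand-in for the `available.packages()` matrix. The package names, versions and repository labels are invented for illustration, and the rest of `tools:::.remove_stale_dups` is not reproduced here.

```r
# Toy stand-in for the matrix returned by available.packages();
# all values are hypothetical.
ap <- rbind(
  c("pkgA", "1.0", "repo1"),
  c("pkgB", "2.0", "repo1"),
  c("pkgA", "1.0", "repo2"),   # exact duplicate of row 1 (same version)
  c("pkgB", "2.1", "repo2")    # same package, different version
)
colnames(ap) <- c("Package", "Version", "Repository")

# Current first line: every row contributes a package name.
pkgs_all <- ap[, "Package"]

# Suggested first line: rows repeating an earlier Package/Version
# pair are dropped before the names are extracted.
pkgs_dedup <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]

length(pkgs_all)    # 4
length(pkgs_dedup)  # 3: the repeated pkgA 1.0 row is gone
```

Rows that differ in version remain, which is why the suggestion notes that the function might still be slow when many packages appear with several versions.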
>> 
>> Kind regards,
>> Maxim Nazarov
>> 
>> ----- Original Message -----
>> From: "Colin Gillespie" <csgillespie using gmail.com>
>> To: "r-devel" <r-devel using r-project.org>
>> Sent: Friday, September 9, 2022 7:33:09 PM
>> Subject: [Rd] Duplicated mirrors on available packages
>> 
>> Hi
>> 
>> When there are duplicated repos, available.packages() takes
>> significantly longer to run.
>> 
>> For example
>> 
>> mirror = "https://cloud.r-project.org/"
>> system.time(available.packages(repos = mirror))
>> #   user  system elapsed
>> # 1.054   0.031   1.245
>> system.time(available.packages(repos = c(mirror, mirror)))
>> #   user  system elapsed
>> # 22.389   0.037  22.429
>> 
>> Best wishes,
>> 
>> Colin
>> 
>> 
>> > sessionInfo()
>> R version 4.2.0 (2022-04-22)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 22.04.1 LTS
>> 
>> Matrix products: default
>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
>> 
>> locale:
>> [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>> [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>> [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> loaded via a namespace (and not attached):
>> [1] compiler_4.2.0 tools_4.2.0
>> 
>> 
>> Dr Colin Gillespie
>> https://twitter.com/csgillespie
>> 
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
