[Rd] Duplicated mirrors on available packages

Colin Gillespie csgillespie using gmail.com
Mon Sep 12 11:08:38 CEST 2022


The use case came from the rig application
(https://github.com/r-lib/rig). Rig (I think) inserts the RStudio
package manager into the list of repos. This can cause duplication in
repos, hence the current issue.
Now that I know the reason, I can work around it.
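
For anyone else hitting this, a minimal sketch of one possible workaround, assuming the duplicates are the same URL repeated (possibly differing only in a trailing slash). The helper name `dedupe_repos` is made up for illustration and is not part of base R or rig. It uses `!duplicated()` rather than `unique()` so that the names of the repos vector are kept.

```r
# Hypothetical helper (not part of base R or rig): drop repeated repository
# URLs while keeping the names of the first occurrences.
dedupe_repos <- function(repos) {
  # Compare URLs ignoring trailing slashes; this is an assumption about
  # how duplicated entries might differ.
  key <- sub("/+$", "", repos)
  repos[!duplicated(key)]
}

# Invented example input: the same mirror listed twice under two names.
repos <- c(CRAN = "https://cloud.r-project.org/",
           RSPM = "https://cloud.r-project.org")
deduped <- dedupe_repos(repos)
print(deduped)

# Typical use (needs network access, so left commented out):
# ap <- available.packages(repos = dedupe_repos(getOption("repos")))
```

Only the first occurrence of each URL survives, so the order of the remaining repos is unchanged.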



On Mon, 12 Sept 2022 at 09:57, Maxim Nazarov
<maxim.nazarov using openanalytics.eu> wrote:
>
> If you profile the second run, you will see that the majority of the time is spent in the `tools:::.remove_stale_dups` function, which loops over all duplicated packages - so all packages in that case.
> One improvement I could think of is to replace the first line of that function
>     pkgs <- ap[, "Package"]
> with
>     pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
> which would help in your example, but the function might still run longer if there are many packages with different versions present, so there may be even better optimizations.
>
> Stating the obvious here, but since we don't know your 'real' use case, adding a `unique` call to the `repos` argument of `available.packages` would achieve a similar improvement without any modifications needed from `tools`.
>
> Kind regards,
> Maxim Nazarov
>
> ----- Original Message -----
> From: "Colin Gillespie" <csgillespie using gmail.com>
> To: "r-devel" <r-devel using r-project.org>
> Sent: Friday, September 9, 2022 7:33:09 PM
> Subject: [Rd] Duplicated mirrors on available packages
>
> Hi
>
> When there are duplicated repos, available.packages() takes
> significantly longer to run.
>
> For example
>
> mirror = "https://cloud.r-project.org/"
> system.time(available.packages(repos = mirror))
> #   user  system elapsed
> # 1.054   0.031   1.245
> system.time(available.packages(repos = c(mirror, mirror)))
> #   user  system elapsed
> # 22.389   0.037  22.429
>
> Best wishes,
>
> Colin
>
>
> > sessionInfo()
> R version 4.2.0 (2022-04-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 22.04.1 LTS
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
>
> locale:
>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.2.0 tools_4.2.0
>
>
> Dr Colin Gillespie
> https://twitter.com/csgillespie
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
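
To illustrate the row-deduplication line suggested in the quoted reply, here is a toy example on a hand-made matrix. The data are invented and this is not the actual `tools` code; it only shows what `duplicated()` does when applied to the `Package` and `Version` columns, and how that shrinks the set of package names that still appear more than once.

```r
# Invented stand-in for the matrix returned by available.packages():
# package A is listed identically in two repos, package B with two versions.
ap <- rbind(
  c(Package = "A", Version = "1.0", Repository = "repo1"),
  c(Package = "B", Version = "2.0", Repository = "repo1"),
  c(Package = "A", Version = "1.0", Repository = "repo2"),
  c(Package = "B", Version = "2.1", Repository = "repo2")
)

# Original first line: every row contributes a package name.
pkgs_all <- ap[, "Package"]

# Suggested replacement: drop rows whose (Package, Version) pair was
# already seen, so exact repeats from a duplicated repo vanish.
pkgs_dedup <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]

# Names occurring more than once, before and after the change.
dups_all   <- unique(pkgs_all[duplicated(pkgs_all)])
dups_dedup <- unique(pkgs_dedup[duplicated(pkgs_dedup)])
print(dups_all)
print(dups_dedup)
```

Package B still shows up twice after the change because its two rows differ in version, which matches the caveat in the quoted reply about packages with different versions.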


