[R] Which CRAN mirror is the fastest ?

spencerg spencer.graves at prodsyse.com
Thu Jul 30 12:04:33 CEST 2009


      There may also be a difference in reliability, which an individual 
user could not easily measure.  I've selected the geographically closest 
mirror until it seemed to be down, then tried the second closest, etc.  
This could be automated centrally, but then you'd have to deal with the 
human factor: how to turn the data into feedback for the people who 
volunteer hardware and support without offending them. 
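
      The "closest first, then the next one" habit could be sketched in 
R roughly as follows.  This is only an illustration under assumptions: 
first_working_mirror() and mirror_is_up() are made-up names, the URLs are 
assumed to be sorted from nearest to farthest already, and a mirror 
counts as "up" if its src/contrib/PACKAGES index can be downloaded.

```r
# Hypothetical helper: TRUE if the mirror serves its package index.
mirror_is_up <- function(url) {
  dest <- tempfile()
  on.exit(unlink(dest))
  # Strip trailing slashes so the path does not contain "//".
  index <- paste(sub("/+$", "", url), "/src/contrib/PACKAGES", sep = "")
  status <- tryCatch(
    suppressWarnings(download.file(index, dest, quiet = TRUE, cacheOK = FALSE)),
    error = function(e) 1L   # any download error counts as "down"
  )
  identical(as.integer(status), 0L) && file.exists(dest)
}

# Hypothetical helper: walk a vector of mirror URLs (assumed sorted
# nearest first) and return the first one that passes the check,
# or NA if none does.
first_working_mirror <- function(urls, is_up = mirror_is_up) {
  for (u in urls) {
    if (isTRUE(is_up(u))) return(u)
  }
  NA_character_
}
```

The is_up argument is there so the check can be swapped out, for 
instance for a stub when no network is available.  A caller would still 
need to handle the NA result before passing it on to options(repos = ...).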


      Spencer


Martin Maechler wrote:
>>>>>> Barry Rowlingson <b.rowlingson at lancaster.ac.uk>
>>>>>>     on Thu, 30 Jul 2009 09:59:47 +0100 writes:
>>>>>>             
>
>     > 2009/7/30 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
>     >> Hard to tell; you have to try it out, I fear.
>     >> 
>     >> The speed you see depends heavily on the connection from your country to
>     >> others, but of course, there are also some mirrors that are not the fastest
>     >> themselves.
>
>     > I figured you could write a function that got the CRAN mirror list and
>     > tested their response. Here's my 'cranometer':
>
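>     > cranometer <- function(ms = getCRANmirrors(all = FALSE, local.only = FALSE)){
>     >   # Time the download of one file from each mirror listed in 'ms'.
>     >   dest <- tempfile()
>     >   ms$t <- rep(NA_real_, nrow(ms))
>     >   for(i in seq_len(nrow(ms))){
>     >     # Strip trailing slashes so the path does not contain "//".
>     >     url <- paste(sub("/+$", "", ms$URL[i]), "/src/base/NEWS", sep = "")
>     >     # try() keeps the loop going if a mirror cannot be reached.
>     >     t <- try(system.time(download.file(url, dest), gcFirst = TRUE),
>     >              silent = TRUE)
>     >     # Record the elapsed time only if the download succeeded;
>     >     # otherwise the entry stays NA.
>     >     if(!inherits(t, "try-error") && file.exists(dest)){
>     >       ms$t[i] <- t[["elapsed"]]
>     >     }
>     >     unlink(dest)
>     >   }
>     >   return(ms)
>     > }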
>
>     > It works by downloading the latest NEWS file (376Kbytes at the
>     > moment, so not huge) from each of the mirror sites in the CRAN mirrors
>     > list. If you want to test it on a subset then call getCRANmirrors
>     > yourself and subset it somehow.
>
>     > I'm running it now on the full CRAN list and I've yet to find a
>     > timeout or error so I'm not sure what will happen if download.file
>     > fails. It returns a data frame like you get from getCRANmirrors but
>     > with an extra 't' column giving the elapsed time to get the NEWS file.
>
>     > CAVEATS: if your network has any local caching then these results
>     > will be wrong, since your computer will probably be getting the
>     > locally cached NEWS file and not the one on the server. Especially if
>     > you run it twice. Oh, I should have put cacheOK=FALSE in the
>     > download.file - but even that might get overruled somewhere. Also,
>     > sites may have good days and bad days, good minutes and bad minutes,
>     > your network may be congested on a short-term basis, etc.
>
>     > Other ideas: how about combining the CRAN list with my geonames
>     > package to work out distances from where you are to the CRAN site? I
>     > might write that later if I get a minute...
>
> Yes!  And visualize the corresponding  "nearest neighbourhood"
> for each CRAN mirror on a world map
> and make this dynamically refreshing every few minutes 
> and put it on a webserver so people can watch the "CRAN world"
> in real time!  
>
> More seriously, it would be really cool if a "robust" version of
> cranometer() could be used automagically in the (typical /
> default) case of install.packages() {and its call from the
> Windows (or also Mac?) 'Packages' menu} when the user / site
> has no CRAN repository specified:
> It would choose the CRAN mirror which is closest,
> or even better (and more appropriate for a statistics software),
> would choose one at random, but with probability inversely
> proportional to (a power of ?) the "distance".
>
> ... yes, we should defer this  from R-help to  R-devel ..
>
> Martin
>
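
      The "random, but inversely proportional to a power of the distance" 
idea might look something like the sketch below.  The function names 
great_circle_km() and pick_mirror() are made up for illustration, and 
since getCRANmirrors() does not supply coordinates, the latitudes and 
longitudes of the user and of each mirror are assumed to come from 
somewhere else (a geonames lookup, say).  Great-circle distance is used 
as a stand-in for "distance"; measured download time would be another 
candidate.

```r
# Great-circle distance in km via the haversine formula.
# Inputs are in degrees; 'radius' is the mean Earth radius in km.
great_circle_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  rad  <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 +
       cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))   # pmin guards against rounding > 1
}

# Pick one mirror at random, with probability proportional to
# distance^(-power).  'min_km' is an assumed floor so that a mirror at
# distance zero does not produce an infinite weight.
pick_mirror <- function(urls, dist_km, power = 1, min_km = 1) {
  stopifnot(length(urls) == length(dist_km), length(urls) > 0)
  w <- pmax(dist_km, min_km)^(-power)
  urls[sample.int(length(urls), size = 1, prob = w / sum(w))]
}
```

With power = 0 every mirror is equally likely, and larger powers 
concentrate the choice on the nearest ones, so the exponent is the knob 
that trades load spreading against locality.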