[Rd] CRAN Server download statistics (Was: R Usage Statistics)

hadley wickham h.wickham at gmail.com
Mon Nov 23 15:48:01 CET 2009


> Knowing what percentage of different OSes are being used is of
> interest to package developers and would be obscured by the proposal
> to massage the data.  I prefer to see the raw figure as is.

I agree.  I was arguing that sorting by that value wasn't very useful.

> Also the number of IPs is important and should not be removed in my
> opinion since (1) it is a measure of clustering.  If a package is
> mainly used by the courses of a few universities where the students
> really have no choice then that seems a lot different than if it's used
> by a variety of people around the world.  Only the IPs would give any
> clue to that.  (2) it helps to diagnose intentional distortion of the
> figures by repeat downloads to the same machine.

There is no way to tease apart (1) and (2), plus many ADSL providers
share an IP across multiple subscribers.  The number of unique IPs may
still be useful, but it needs to be used with caution.
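
To make the caveat concrete, here is a minimal sketch of the kind of
summary being discussed: total downloads, unique IPs, and downloads per
IP for each package.  The log format (a list of package/IP pairs), the
package names and the addresses are all made up for illustration; this
is not how the CRAN statistics page is actually computed.

```python
from collections import defaultdict


def summarize_downloads(log):
    """Summarize an iterable of (package, ip) pairs, one per download."""
    downloads = defaultdict(int)
    ips = defaultdict(set)
    for package, ip in log:
        downloads[package] += 1
        ips[package].add(ip)
    return {
        pkg: {
            "downloads": downloads[pkg],
            "unique_ips": len(ips[pkg]),
            # A high ratio may mean repeat downloads to one machine, or
            # many users behind one shared address; the counts alone
            # cannot distinguish the two.
            "downloads_per_ip": downloads[pkg] / len(ips[pkg]),
        }
        for pkg in downloads
    }


# Hypothetical log: pkgA is concentrated on two addresses, pkgB is spread out.
log = (
    [("pkgA", "10.0.0.1")] * 8
    + [("pkgA", "10.0.0.2")] * 2
    + [("pkgB", "10.0.1.%d" % i) for i in range(10)]
)
summary = summarize_downloads(log)
```

Both packages have 10 downloads here, but pkgA comes from 2 addresses
and pkgB from 10.  Whether pkgA reflects a classroom, a shared ADSL
address or one person re-downloading is exactly what the figures cannot
tell you.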

> The one problem with sparkline graphs is that it would take a lot
> longer for the page to load.  There already is a time series if you
> click on the package name.

Is it a time series?  It looks like a bar chart of downloads per day
of week to me.
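
The distinction I have in mind can be sketched as follows.  The dates
are invented and the function names are my own; this only illustrates
the difference between the two summaries, not what the site does.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def daily_series(dates):
    """Time series: one count per calendar date, in date order."""
    return sorted(Counter(dates).items())


def by_weekday(dates):
    """Day-of-week summary: counts pooled over all weeks."""
    counts = Counter(d.weekday() for d in dates)  # Monday is 0
    return [(WEEKDAYS[i], counts.get(i, 0)) for i in range(7)]


# Hypothetical download dates spanning two weeks.
dates = [
    date(2009, 11, 16),
    date(2009, 11, 16),
    date(2009, 11, 17),
    date(2009, 11, 23),
]
series = daily_series(dates)
weekday_counts = by_weekday(dates)
```

The time series keeps the two Mondays separate (2 downloads, then 1),
so you can see a trend.  The day-of-week summary pools them into a
single Monday bar of 3 and loses the ordering in time.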

Hadley

-- 
http://had.co.nz/


