Evidence from Debian's package tracking (Was Re: [R] Size of R user base.)

Ajay Shah ajayshah at mayin.org
Thu Apr 22 07:34:40 CEST 2004

I have watched the discussions about the size of the R user base with
much interest. One more source of data that might help is the
voluntary data capture in Debian. If you are a Debian user, you should
volunteer information. It's very easy: as root, say:

      # apt-get install popularity-contest

The results are found at:


This shows that of the 4800 people who volunteered information, 1631
had installed gnuplot -- which suggests that perhaps one third of
Debian installs are by numerate people. R-base was installed by
roughly one-tenth of the sample.

So that's one useful fact: Roughly one in ten Debian users is an R
user. Roughly one in three of the numerate users is an R user.
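
The shares above can be checked with a few lines of arithmetic. What
follows is only a sketch in Python. The 4800 and 1631 counts are the
ones quoted above. The exact R-base count is not given here, so the
"roughly one-tenth" share is used directly as an assumption. It also
assumes, as the reasoning above implicitly does, that R users are a
subset of the gnuplot ("numerate") users. The variable names are
illustrative only.

```python
# Counts quoted in the message.
sample_size = 4800        # people who volunteered information
gnuplot_installs = 1631   # of those, how many had gnuplot installed

# Assumption: the exact R-base count is not stated, only "roughly
# one-tenth of the sample", so the share is entered directly.
r_share = 0.10

# gnuplot is used as a proxy for "numerate" users.
gnuplot_share = gnuplot_installs / sample_size

# Assumption: every R user is also counted among the gnuplot users.
r_among_numerate = r_share / gnuplot_share

print(f"gnuplot share of sample:     {gnuplot_share:.2f}")
print(f"R share among gnuplot users: {r_among_numerate:.2f}")
```

Both results come out close to one third, in line with the rounded
figures above.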

I would take this one-in-ten fact quite seriously, except for the
extent to which R users are perhaps more likely (as compared
with the population) to volunteer information about what packages they

Now let's engage in some wild guesswork.

* It is believed that there are roughly 2e7 desktops in the world
  today, running a freeware Unix system.

* Debian is undoubtedly a biased source of data, in having the more
  geeky users. Let's knock off a factor of 10 in order to correct for

* If we think that 1% of all freeware Unix users are R users, then we
  get to an estimate of 200,000 users of R in the freeware Unix
  world. There would be more using Mac OS X, Solaris, etc.
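
The guesswork in the list above amounts to one division and one
multiplication. A minimal sketch in Python, using only the figures
given in the list (each of them a rough guess, not a measurement);
the variable names are illustrative only:

```python
# All three inputs are rough guesses taken from the list above.
freeware_unix_desktops = 2e7   # believed number of freeware Unix desktops
debian_r_share = 0.10          # roughly one in ten of the Debian sample
geek_correction = 10           # "knock off a factor of 10" for Debian's bias

# Corrected share of R users among all freeware Unix users (about 1%).
r_share_all = debian_r_share / geek_correction

# Estimated number of R users in the freeware Unix world.
estimated_r_users = freeware_unix_desktops * r_share_all

print(f"Estimated R users on freeware Unix: {estimated_r_users:,.0f}")
```

Changing geek_correction shows how sensitive the estimate is to that
one guess: a factor of 5 instead of 10 doubles the result.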

Google data shows that 1% of Google hits are from Linux while 4% are
from Mac users. So for each Linux user, there are 4 Mac OS X
users. But then, a lot of them are Aunt Tillie, and are unlikely to
need anything more than a calculator.

Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi
