[R] Permutations

Sat Nov 19 10:05:42 CET 2011

You are correct, Michael! I had (rashly) assumed that Gyanendra's
error message was the result of asking for the factorial of a
number which was too large, without checking it against the limits.
I then embarked on a discussion of large-nymber problems in R as
relevant to his type of query.

My final section, pointing out the limits as ascertainable in R,
in fact (as you point out) shows that factorial(150) is not too
large a number; but I did not cross-check with the original query.
Apologies for that.

In fact, the function factorial() copes perfectly well with his
problem (as I understood it -- see below):

  print(factorial(150),22)
  # [1] 5.713383956445705e+262
## R will not allow printing to more than 22 digits, whereas you
## really need 262 digits here! Also, you still only see 16.

  print(factorial(150)/(factorial(10)*factorial(140)),18)
  # [1] 1169554298222310
## Which is the correct answer (as determined previously)

So it would seem that Gyanendra's error message arose when
evaluating a different expression, suggesting that perhaps
the problem he was trying to solve was not what it seemed
to be from his original query:

  "I have to calculate the number of subset formed by 150
   samples taking 10 at a time."

I had (see my original reply) assumed that what he needed
was the number of ways of choosing subsets of size 10 out
of a set of size 150. But now it is clear that this should
not give rise to that error message,

So, Gyanendra, what *exactly* is the question? Then it would
perhaps be possible to locte the source of the error message.
Or, if the question was indeed as I originally assumed it
to be, what was the expression which you tried to evaluate?

Can you post the commands which you used, and the output?
Ted.

On 19-Nov-11 01:25:02, R. Michael Weylandt wrote:
> What build of R can't calculate factorial(150)? I thought the max
> double of a 32 bit build would be on the order of 10^308 ~ 2^1024 (the
> material below seems to agree with me on this)
> 
> But yes, agreed with Ted: it's very helpful to think a bit about what
> you are calculating when doing these sorts of large number
> calculations before asking for the accurate calculation of huge
> numbers.
> 
> Michael
> 
> On Fri, Nov 18, 2011 at 4:31 PM, Ted Harding <ted.harding at wlandres.net>
> wrote:
>> On 18-Nov-11 16:03:44, Gyanendra Pokharel wrote:
>>> Hi all,
>>> why factorial(150) shows the error out of range in 'gammafn'?
>>> I have to calculate the number of subset formed by 150 samples
>>> taking 10 at a time. How is this possible?
>>> best
>>
>> Because factorial(150) is nearly 10^263, which is far greater
>> than any number that R is capable of storing.
>>
>> I would, perhaps, suggest you work with lgamma(), the logarithm
>> (to base e) of the Gamma function.
>>
>> Thus factorial(150) = gamma(151), and
>>
>> _lgamma(151)
>> _# [1] 605.0201
>>
>> so factorial(150) is close to e^605; to get it as a power of 10:
>>
>> _lgamma(151)*log10(exp(1))
>> _# [1] 262.7569
>>
>>
>> Hence the "nearly 10^263" above.
>>
>> If your question means "the number of ways of choosing 10 out
>> of 150, then, using lgamma(), its log to base e is:
>>
>> _lgamma(151) - lgamma(11) - lgamma(141)
>> _# [1] 34.6954
>>
>> and its log to base 10 is:
>>
>> _(lgamma(151) - lgamma(11) - lgamma(141))*log10(exp(1))
>> _# [1] 15.06802
>>
>> so the result is about 10^15 which *is* within R's range.
>>
>> Hence:
>>
>> _10^( (lgamma(151) - lgamma(11) - lgamma(141))*log10(exp(1)) )
>> _# 1.169554e+15
>>
>> You can see this (approximately) in an all-digits form by
>> using print():
>>
>> _X <- 10^( (lgamma(151) - lgamma(11) - lgamma(141))*log10(exp(1)) )
>> _print(X,18)
>> _# [1] 1169554298222353
>>
>> However, in many cases (such as your example) it will be feasible
>> to exploit the reduction in numbers of factors if you cancel out
>> the factors in the larger denominator factorial from the numerator
>> factorial, so that the number you want is
>>
>> _150*149*148*147*146*145*144*143*142*141/(10*9*8*7*6*5*4*3*2*1)
>>
>> = (150/10)*(149/9)*(148/8)* ... *(142/2)*(141/1)
>>
>> for which you can define a function nCr(n,r):
>>
>> _nCr <- function(n,r){
>> _ _num <- (n:(n-r+1))
>> _ _den <- (r:1)
>> _ _prod(num/den)
>> _}
>>
>> and then:
>>
>> _print(nCr(150,10),18)
>> _# [1] 1169554298222310
>>
>> which is not quite the same as the result 1169554298222353
>> obtained above using lgamma(). This is because the previous
>> result is affected by rounding errors occuring in the
>> logarithms. The result obtained using the function nCr() is
>> exact. However, it can fail in its turn if you want one of
>> the larger possible results, such as nCr(1000,500):
>>
>> _nCr(1500,750)
>> _# [1] Inf
>>
>> since the result is too large for R to store (the largest on
>> my machine is about 2*10^308, see below)), and the products
>> in the function definition accumulate until this number is
>> exceeded.
>>
>> You can find out the limits of R storage on your machine
>> by using the command
>>
>> _.Machine
>>
>> which gives a long list of the characteristics of the machine
>> you are using. In particular, you will see:
>>
>> _$double.xmax
>> _[1] 1.797693e+308
>>
>> is the largest number that R will store; and:
>>
>> _$double.eps
>> _[1] 2.220446e-16
>>
>> which is the smallest positive number; and:
>>
>> _$double.neg.eps
>> _[1] 1.110223e-16
>>
>> which is the smallest negative number (note that the latter
>> is half the former).
>>
>> They can be individually accessed as:
>>
>> _.Machine$double.xmax
>> _# [1] 1.797693e+308
>>
>> _.Machine$double.eps
>> _# [1] 2.220446e-16
>>
>> _.Machine$double.neg.eps
>> _# [1] 1.110223e-16
>>
>> Hoping this helps!
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <ted.harding at wlandres.net>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 18-Nov-11 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Time: 21:31:04
>> ------------------------------ XFMail ------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 19-Nov-11                                       Time: 09:05:37
------------------------------ XFMail ------------------------------