[Rd] sum() returns NA on a long *logical* vector when nb of TRUE values exceeds 2^31

Martin Maechler maechler at stat.math.ethz.ch
Tue Jun 6 09:45:44 CEST 2017


>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>>     on Fri, 2 Jun 2017 04:05:15 -0700 writes:

    > Hi, I have a long numeric vector 'xx' and I want to use
    > sum() to count the number of elements that satisfy some
    > criteria like non-zero values or values lower than a
    > certain threshold etc...

    > The problem is: sum() returns NA (with a warning) if the
    > count exceeds .Machine$integer.max (2^31 - 1). For example:

    >> xx <- runif(3e9)
    >> sum(xx < 0.9)
    >    [1] NA
    >    Warning message:
    >    In sum(xx < 0.9) : integer overflow - use sum(as.numeric(.))
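
The limit behind this warning can be seen at small scale, without
allocating 3e9 elements. The following is a minimal sketch, not the
original poster's code. It uses integer `+` rather than sum() itself,
because the exact behaviour of sum() on overflow may depend on the R
version in use.

```r
# Largest value an R integer can hold: 2^31 - 1
int_max <- .Machine$integer.max

# integer + integer stays integer, so one step past the limit gives NA
# (R also emits a warning, suppressed here to keep the output quiet)
overflowed <- suppressWarnings(int_max + 1L)
print(is.na(overflowed))      # TRUE

# Converting to double first avoids the overflow
as_double <- as.numeric(int_max) + 1
print(as_double == 2^31)      # TRUE
```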

    > This already takes a long time and doing
    > sum(as.numeric(.)) would take even longer and require
    > allocation of 24 GB of memory just to store an intermediate
    > numeric vector made of 0s and 1s. Plus, having to do
    > sum(as.numeric(.)) every time I need to count things is
    > not convenient and is easy to forget.
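
One way to avoid the large intermediate numeric vector is to count in
chunks and accumulate the total in a double. This is only a sketch of
a possible workaround, not the patch discussed in this thread; the
function name count_if_chunked and its chunk_size argument are
hypothetical.

```r
# Count the elements of x for which pred() is TRUE, one chunk at a time.
# Each per-chunk count is a small integer; the running total is a double,
# so it cannot hit the integer limit.
count_if_chunked <- function(x, pred, chunk_size = 1e6) {
  n <- length(x)
  total <- 0                       # double accumulator
  start <- 1
  while (start <= n) {
    end <- min(start + chunk_size - 1, n)
    total <- total + sum(pred(x[start:end]))
    start <- end + 1
  }
  total
}

# Small usage example; the chunk size deliberately does not divide length(xx)
set.seed(1)
xx <- runif(1e5)
chunked <- count_if_chunked(xx, function(v) v < 0.9, chunk_size = 3e4)
direct  <- sum(xx < 0.9)
print(chunked == direct)           # TRUE
```

Only one chunk of x and one logical chunk are held at a time, so the
extra memory is bounded by chunk_size rather than by length(x).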

    > It seems that sum() on a logical vector could be modified
    > to return the count as a double when it cannot be
    > represented as an integer.  Note that length() already
    > does this, so that wouldn't create a precedent. Also,
    > FWIW, prod() avoids the problem by always returning a
    > double, whatever the type of the input is (except on a
    > complex vector).
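
The return types mentioned above can be checked on small inputs. This
is a short illustration using only base R. The long-vector case for
length() is not reproduced here, since it needs more than 2^31 - 1
elements.

```r
# typeof() of the results of sum(), prod() and length() on small inputs
types <- c(
  sum_logical  = typeof(sum(c(TRUE, FALSE, TRUE))),  # "integer"
  sum_integer  = typeof(sum(1:3)),                   # "integer"
  prod_integer = typeof(prod(1:3)),                  # "double"
  prod_logical = typeof(prod(c(TRUE, TRUE))),        # "double"
  prod_complex = typeof(prod(c(1+1i, 2))),           # "complex"
  length_short = typeof(length(1:3))                 # "integer"
)
print(types)
```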

    > I can provide a patch if this change sounds reasonable.

This sounds very reasonable; thank you, Hervé, for the report,
and even more for a (small) patch.

Martin

    > Cheers, H.

    > -- 
    > Hervé Pagès

    > Program in Computational Biology
    > Division of Public Health Sciences
    > Fred Hutchinson Cancer Research Center
    > 1100 Fairview Ave. N, M1-B514
    > P.O. Box 19024
    > Seattle, WA 98109-1024

    > E-mail: hpages at fredhutch.org
    > Phone: (206) 667-5791
    > Fax: (206) 667-1319

