[Rd] sum() vs cumsum() implicit type coercion

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 25 12:25:01 CEST 2020


>>>>> Tomas Kalibera 
>>>>>     on Tue, 25 Aug 2020 09:29:05 +0200 writes:

    > On 8/23/20 5:02 PM, Rory Winston wrote:
    >> Hi
    >> 
    >> I noticed a small inconsistency when using sum() vs cumsum()
    >> 
    >> I have a char-based series
    >> 
    >> > tryjpy$long
    >>  [1] "0.0022"  "-0.0002" "-0.0149" "-0.0023" "-0.0342" "-0.0245" "-0.0022"
    >>  [8] "0.0003"  "-0.0001" "-0.0004" "-0.0036" "-0.001"  "-0.0011" "-0.0012"
    >> [15] "-0.0006" "0.0016"  "0.0006"
    >> 
    >> When I run sum() vs cumsum(), sum() fails but cumsum() converts the
    >> series to numeric before summing:
    >> 
    >>> sum(tryjpy$long)
    >> Error in sum(tryjpy$long) : invalid 'type' (character) of argument
    >> 
    >>> cumsum(tryjpy$long)
    >> [1]  0.0022  0.0020 -0.0129 -0.0152 -0.0494 -0.0739 -0.0761 -0.0758 -0.0759
    >> [10] -0.0763 -0.0799 -0.0809 -0.0820 -0.0832 -0.0838 -0.0822 -0.0816
    >> 
    >> Which I guess is due to the following line in do_cum():
    >> 
    >> PROTECT(t = coerceVector(CAR(args), REALSXP));
    >> 
    >> This might be fine, and there may be very good reasons why there is no
    >> coercion in sum(); it just seems a little inconsistent in usage.

    > Yes. I don't know the reason for this design, but please note it is
    > documented in ?sum and in ?cumsum, which would also make it harder to
    > change. One can always stick to a consistent subset, i.e. not rely on
    > the coercion, e.g. from character.

    > Best
    > Tomas

Indeed.
Further note that most arithmetic/math *fails* on
character vectors, so if a change were to be made, it
should rather be such that cumsum() also rejects character
input.
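
A small sketch of the point above, using the first three values from the
original post. The helper works() is hypothetical (not part of base R); it
only reports whether an expression evaluates without error. The cumsum()
result reflects the behaviour reported in this thread.

```r
x <- c("0.0022", "-0.0002", "-0.0149")   # character, as in the original post

# Hypothetical helper: TRUE if the expression evaluates, FALSE if it errors.
works <- function(expr) tryCatch({ expr; TRUE }, error = function(e) FALSE)

works(x + 1)      # arithmetic operator on character input
works(exp(x))     # math function on character input
works(sum(x))     # sum() rejects character input
works(cumsum(x))  # cumsum() coerces to numeric first, as reported above
```

Only the last call is expected to succeed, which is the inconsistency under
discussion.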

We would have consistency then, but potentially break user code,
even package code which has hitherto assumed cumsum() to coerce
to numeric first.
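
Code that wants to be independent of any such change can coerce explicitly.
The following is a minimal sketch, assuming a hypothetical wrapper named
cumsum_strict() (not an existing R function) that converts character input
itself and stops on strings that do not parse as numbers.

```r
# Hypothetical wrapper, not part of base R: coerce character input
# explicitly instead of relying on cumsum()'s implicit coercion.
cumsum_strict <- function(x) {
  if (is.character(x)) {
    # as.numeric() yields NA (with a warning) for strings it cannot parse
    num <- suppressWarnings(as.numeric(x))
    bad <- is.na(num) & !is.na(x)
    if (any(bad))
      stop("cannot convert to numeric: ", paste(x[bad], collapse = ", "))
    x <- num
  }
  cumsum(x)
}

cumsum_strict(c("1", "2", "3"))
```

The same explicit as.numeric() step makes sum() usable on such input, so
both functions are then called in the same way.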

If a majority of commentators and R core think we should make
such a change, I'd agree to consider it.

Otherwise, we save (ourselves and others) a bit of time.
Martin


