[Rd] rowSums()/colSums() don't preserve the 'integer' storage mode

Henrik Bengtsson hb at stat.berkeley.edu
Fri Jul 18 03:42:22 CEST 2008


Hi,

there is one more thing to consider: the risk of getting integer
overflow.  Should that be ignored?  How is this handled by sum()?

I think it is good if {col|row}Sums() would return the same data type
as the input object.

Cheers

Henrik

On Thu, Jul 17, 2008 at 6:32 PM, Herve Pages <hpages at fhcrc.org> wrote:
> Hi Bill,
>
> Bill.Venables at csiro.au wrote:
>>
>> I don't see the cost of doing so paying off.
>
> The cost seems rather small. For the "if" block that is tagged /* columns */
> in the do_colsum() function (src/main/array.c), that would give something
> like:
>
>    if (OP == 0 || OP == 1) { /* columns */
>        int ans_type = (type == REALSXP || OP == 1) ? REALSXP : INTSXP;
>        int ina = type == INTSXP ? NA_INTEGER : NA_LOGICAL;
>        PROTECT(ans = allocVector(ans_type, p));
>        for (j = 0; j < p; j++) {
>            double rsum = 0.0;
>            int isum = 0;
>            switch (type) {
>            case REALSXP:
>                rx = REAL(x) + n*j;
>                for (i = cnt = 0; i < n; i++, rx++) {
>                    if (!ISNAN(*rx)) {cnt++; rsum += *rx;}
>                    else if (keepNA) {rsum = NA_REAL; break;}
>                }
>                if (OP == 1) {
>                    if (cnt == 0 || ISNAN(rsum))
>                        REAL(ans)[j] = NA_REAL;
>                    else
>                        REAL(ans)[j] = rsum / cnt;
>                } else
>                    REAL(ans)[j] = rsum;
>            break;
>            case INTSXP: case LGLSXP:
>                ix = (type == INTSXP ? INTEGER(x) : LOGICAL(x)) + n*j;
>                for (i = cnt = 0; i < n; i++, ix++) {
>                    if (*ix != ina) {cnt++; isum += *ix;}
>                    else if (keepNA) {isum = NA_INTEGER; break;}
>                }
>                if (OP == 1) {
>                    if (cnt == 0 || isum == NA_INTEGER)
>                        REAL(ans)[j] = NA_REAL;
>                    else
>                        REAL(ans)[j] = ((double) isum) / cnt;
>                } else
>                    INTEGER(ans)[j] = isum;
>            break;
>            default:
>                /* we checked the type above, but be sure */
>                UNIMPLEMENTED_TYPEt("do_colsum", type);
>            }
>        }
>    }
>
> So now you have 42 lines instead of 37 in the current code.
> Then do something similar for the "if" block tagged /* rows */.
> Basically the new code would not be more complicated than the
> current code.
>
> In the end rowSums()/colSums() will do the _right_ thing i.e.
> they'll preserve the 'integer' storage mode, and, by doing so, will
> behave consistently with the sum() function.
>
> BTW since the LGLSXP type is supported then the "'x' must be numeric"
> error msg at the beginning of the do_colsum() function is inappropriate.
>
> Cheers,
> H.
>
>
>>
>> storage.mode is really only important if you are passing arguments to
>> compiled code.
>>
>> If you are passing to compiled code, you really need to ensure the
>> storage mode is what you think it is, anyway.
>>
>> Bill Venables.
>>
>>
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org
>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Herve Pages
>> Sent: Thursday, 17 July 2008 3:48 PM
>> To: R-devel at r-project.org
>> Subject: [Rd] rowSums()/colSums() don't preserve the 'integer' storage
>> mode
>>
>> Hi,
>>
>> Wouldn't that make sense to have rowSums()/colSums() to preserve the
>> storage mode?
>>
>> m <- matrix(1:15, nrow=5)
>>
>>  > storage.mode(m)
>> [1] "integer"
>>
>>  > storage.mode(sum(m))
>> [1] "integer"
>>
>>  > storage.mode(rowSums(m))
>> [1] "double" <------------------- surprising!
>>
>> Cheers,
>> H.
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



More information about the R-devel mailing list