[Rd] There is pmin and pmax each taking na.rm, how about psum?

Matthew Dowle mdowle at mdowle.plus.com
Sun Nov 4 22:28:35 CET 2012


> On Sun, Nov 4, 2012 at 6:35 AM, Justin Talbot <jtalbot at stanford.edu>
> wrote:
>>>
>>> Then the case for psum is more for convenience and speed -vs-
>>> colSums(rbind(x,y), na.rm=TRUE), since rbind will copy x and y into a
>>> new matrix. The case for pprod is similar, plus colProds doesn't exist.
>>>
>>
>> Right, and consistency; for what that's worth.
>>
>>>> Thus, + should have the signature: `+`(..., na.rm=FALSE), which would
>>>> allow you to do things like:
>>>>
>>>> `+`(c(1,2),c(1,2),c(1,2),NA, na.rm=TRUE) = c(3,6)
>>>>
>>>> If you don't like typing `+`, you could always alias psum to `+`.
>>>
>>> But there would be a cost, wouldn't there? `+` is a dyadic .Primitive.
>>> Changing that to take `...` and `na.rm` could slow it down (iiuc), and
>>> any changes to the existing language are risky.  For example:
>>>     `+`(1,2,3)
>>> is currently an error. Changing that to do something might have
>>> implications for some of the 4,000 packages (some might rely on that
>>> being an error), with a possible speed cost too.
>>>
>>
>> There would be a very slight performance cost for the current
>> interpreter. For the new bytecode compiler though there would be no
>> performance cost since the common binary form can be detected at
>> compile time and an optimized bytecode can be emitted for it.
>>
>> Taking what's currently an error and making it legal is a pretty safe
>> change; unless someone is currently relying on `+`(1,2,3) to return an
>> error, which I doubt. I think the bigger question on making this
>> change work would be on the S3 dispatch logic. I don't understand the
>> intricacies of S3 well enough to know if this change is plausible or
>> not.

Interesting. Sounds more possible than I thought.

>>
>>> In contrast, adding two functions that didn't exist before: psum and
>>> pprod, seems to be a safer and simpler proposition.
>>
>> Definitely easier. Leaves the language a bit more complicated, but
>> that might be the right trade off. I would strongly suggest adding
>> pany and pall as well. I find myself wishing for them all the time.
>> prange would be nice as well.
>
> Have a look at the matrixStats package; it might bring what you're looking
> for:
>
>   http://cran.r-project.org/web/packages/matrixStats
>
> /Henrik

Nice package and very handy. It has colProds, too. But its functions take
a matrix.

' Then the case for psum is more for convenience and speed -vs-
colSums(rbind(x,y), na.rm=TRUE), since rbind will copy x and y into a
new matrix. '
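
For concreteness, here is a minimal R-level sketch of what a psum might
look like. The name, the signature and the recycling behaviour are only
assumptions modelled loosely on pmin/pmax; nothing like this exists in
base R. Being plain R it still allocates intermediate vectors, so it
illustrates the interface rather than the speed argument, which would
need a C-level implementation.

```r
# Hypothetical psum(): element-wise ("parallel") sum across vectors.
# Assumption: shorter arguments are recycled to the longest length.
psum <- function(..., na.rm = FALSE) {
  args <- list(...)
  if (length(args) == 0L) return(numeric(0))
  n <- max(lengths(args))
  total <- numeric(n)
  for (x in args) {
    x <- rep_len(x, n)                  # recycle to common length
    if (na.rm) x[is.na(x)] <- 0         # treat missing values as zero
    total <- total + x
  }
  total
}

psum(c(1,2), c(1,2), c(1,2), NA, na.rm=TRUE)   # 3 6
psum(c(1,2), c(1,2), c(1,2), NA)               # NA NA
```

One design choice to flag: with na.rm=TRUE this returns 0 at positions
where every input is NA, which matches colSums(..., na.rm=TRUE) but not
pmax, which returns NA there.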

Matthew


