[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
maechler at stat.math.ethz.ch
Wed Aug 24 11:36:38 CEST 2016
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Tue, 23 Aug 2016 14:33:58 +0200 writes:
>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>> on Fri, 19 Aug 2016 11:40:05 -0500 writes:
>> It is the old story of defined behaviour and expected outcomes. Hard to
>> change now.
> yes... not impossible though... see below
>> So I would suggest you do something like this in your ~/.Rprofile:
R> smry <- function(...) summary(..., digits=6)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>  155555  155555  155555  155555  155555  155555
>> Maybe call it Summary() instead.
> yes, do use a different name. There are other such functions, e.g. 'summarize()'.
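A minimal sketch of the wrapper suggested above, for readers who want to try it. It assumes that summary() on a plain numeric vector accepts a `digits` argument, as the quoted call implies. The input value 155555 is an assumption chosen to match the quoted output, and the name `smry` is kept so that no existing function is masked.

```r
# Wrapper under a non-clashing name; digits = 6 is passed on to summary()
smry <- function(...) summary(..., digits = 6)

x <- 155555   # assumed sample value, chosen to match the quoted output
s <- smry(x)
print(s)
```

Putting the definition in ~/.Rprofile, as suggested, makes it available in every interactive session.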
> Simone wrote
>> I had raised the matter ten years ago, and I was told that the topic was
>> already very^3 old
>> there is some discussion on its origin and also a declaration of intents to
>> change the default behaviour, which, unfortunately, remained a declaration.
>> I agree that R could do better here, let's hope in less than ten years
>> though. ;-)
> and the 2006 thread he mentions is basically a similar question
> and a reply by me that I agreed to some extent that a change was
> desirable ... originally we had adhered to the S "standard"
> which became the S+ one and at that time I did still have access
> to a running instance of S-PLUS 6.2 where I had seen that
> Insightful (the company curating and selling S-PLUS)
> also had decided to change the ~15 year old S "standard"... and
> indeed I was implicitly *asking* for proposals of such a change,
> but I think I never saw a (careful) proposal.
> In the spirit of probably 99% of other "base R" code, a change
> should really *not* round __at all__ in the summary() methods,
> but *only* in the print() methods of such summary() results.
> OTOH, for back compatibility, if a user does use summary(.., digits=.)
> explicitly, these digits should be 'obeyed' of course.
> I think summary(<1-variable>) could easily, and relatively "back-compatibly"
> be changed in the above vein.
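The idea described above (keep full precision in the summary object, round only when printing, but obey an explicit `digits`) can be sketched as follows. This is an illustration only, not the code that went into R: the function `summary_unrounded()` and the class name `smry_demo` are hypothetical, and the default number of print digits is an assumption.

```r
# Hypothetical summary function: does NOT round unless 'digits' is given
summary_unrounded <- function(x, digits) {
  x <- x[!is.na(x)]
  qq <- stats::quantile(x, names = FALSE)
  out <- c(qq[1:3], mean(x), qq[4:5])
  names(out) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
  # Back compatibility: an explicitly supplied 'digits' is obeyed here
  if (!missing(digits)) out <- signif(out, digits)
  structure(out, class = "smry_demo")
}

# Rounding for display happens only in the print method (assumed default digits)
print.smry_demo <- function(x, digits = max(3L, getOption("digits") - 3L), ...) {
  print.default(format(unclass(x), digits = digits), quote = FALSE, ...)
  invisible(x)
}

s <- summary_unrounded(c(1.23456789, 2.3456789, 10))
print(s)             # formatted for display
unclass(s)[["Mean"]] # the stored value keeps full precision
```

With this split, code that computes on the summary result sees unrounded numbers, while interactive output stays compact.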
> One "real problem" is the wrong decision (also from S and S-PLUS
> times IIRC) to return a "character" matrix for
> summary(<data.frame>, ..)
> or summary(<matrix>, ..)
> (For a data frame, I think it should return a list() of
> single-variable summary()es, or else a numeric matrix .. in
> both cases with a good print() method)
> because when you return a character matrix, all the numbers are
> already rounded, ... and if we follow the above approach they
> would have to be rounded further... ``the horror''
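To make the contrast concrete, here is a small sketch comparing what summary(<data.frame>) returns with the two alternatives mentioned above (a list of single-variable summaries, or a numeric matrix). The data frame `df` and its columns are made up for illustration.

```r
# Made-up example data
df <- data.frame(a = c(1.234567, 2.345678, 3.456789),
                 b = c(10, 20, 30))

# Current behaviour: a character matrix of pre-formatted strings
s_df <- summary(df)
is.character(s_df)

# Alternative 1: a list of single-variable summaries (still numeric)
per_var <- lapply(df, summary)
is.numeric(unclass(per_var$a))

# Alternative 2: a numeric matrix, one column per variable
num_mat <- sapply(df, function(v) unclass(summary(v)))
dim(num_mat)
```

Because the numeric alternatives keep numbers as numbers, any rounding can be left to a print method instead of being baked into strings.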
> I wonder how much code out there is relying on the internal
> structure of summary(<data.frame>).. because that is the one
> part I'd definitely want to change, too.
[Talking to myself .. ;-)]
Yes, but that's the tough part to change.
This thread's topic is really only about changing summary.default(),
and I have started testing such a change now, and that does seem feasible:
- No rounding in summary.default(), but
- (almost) back-compatible rounding in its print() method.
My current plan is to commit this to R-devel in a day or so,
unless unforeseen issues emerge.