[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
maechler at stat.math.ethz.ch
Tue Aug 23 14:33:58 CEST 2016
>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>> on Fri, 19 Aug 2016 11:40:05 -0500 writes:
> It is the old story of defined behaviour and expected outcomes. Hard to
> change now.
yes... not impossible though... see below
> So I would suggest you do something like this in your ~/.Rprofile:
R> smry <- function(...) summary(..., digits=6)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  155555  155555  155555  155555  155555  155555
> Maybe call it Summary() instead.
yes, do use a different name. There are other such functions, e.g., 'summarize()'.
> I had raised the matter ten years ago, and I was told that the topic was
> already very^3 old
> there is some discussion on its origin and also a declaration of intents to
> change the default behaviour, which, unfortunately, remained a declaration.
> I agree that R could do better here, let's hope in less than ten years
> though. ;-)
and the 2006 thread he mentions is basically a similar question
and a reply by me that I agreed to some extent that a change was
desirable ... originally we had adhered to the S "standard"
which became the S+ one and at that time I did still have access
to a running instance of S-PLUS 6.2 where I had seen that
Insightful (the company curating and selling S-PLUS)
also had decided to change the ~15 year old S "standard"... and
indeed I was implicitly *asking* for proposals of such a change,
but I think I never saw a (careful) proposal.
In the spirit of probably 99% of other "base R" code, a change
should really *not* round __at all__ in the summary() methods,
but *only* in the print() methods of such summary() results.
OTOH, for back compatibility, if a user does use summary(.., digits=.)
explicitly, these digits should be 'obeyed' of course.
I think summary(<1-variable>) could easily, and relatively "back-compatibly"
be changed in the above vein.
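As a rough illustration of that idea (a sketch only, not a proposal for
base R): the function summary_raw() and the class "summaryRaw" are names
made up here. The summary step keeps full precision unless 'digits' is
given explicitly, and only the print() method rounds. The fallback of
max(3, getOption("digits") - 3) in print is an assumption as well.

```r
# Hypothetical sketch: 'summary_raw' and class "summaryRaw" are
# illustrative names only, not part of base R.
summary_raw <- function(x, digits = NULL) {
  qq <- stats::quantile(x, names = FALSE)   # min, quartiles, max
  val <- c(Min. = qq[1], `1st Qu.` = qq[2], Median = qq[3],
           Mean = mean(x), `3rd Qu.` = qq[4], Max. = qq[5])
  # Round in the summary step only when the caller asked for it explicitly.
  if (!is.null(digits)) val <- signif(val, digits)
  class(val) <- "summaryRaw"
  attr(val, "digits") <- digits   # assigning NULL leaves the attribute unset
  val
}

# All display rounding happens here, at print time.
print.summaryRaw <- function(x, digits = attr(x, "digits"), ...) {
  if (is.null(digits)) digits <- max(3L, getOption("digits") - 3L)
  print(format(c(unclass(x)), digits = digits), quote = FALSE, ...)
  invisible(x)
}

x <- c(155554.6, 155555.2, 155555.9)
s <- summary_raw(x)                 # stored values are unrounded
s4 <- summary_raw(x, digits = 4)    # explicit digits are obeyed
```

With this split, code that computes on the result of summary_raw(x) sees
the exact numbers, and only what is shown on screen is shortened.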
One "real problem" is the wrong decision (also from S and S-PLUS
times IIRC) to return a "character" matrix for
summary(<data.frame>, ..)
or summary(<matrix>, ..)
(For a data frame, I think it should return a list() of
single-variable summary()es, or then a numeric matrix .. in
both cases have a good print() method)
because when you return a character matrix, all the numbers are
already rounded, ... and if we follow the above approach they
would have to be rounded further... ``the horror''
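To see the character-matrix problem concretely, and what the list() or
numeric-matrix alternative could look like, here is a small sketch. The
data frame 'df' and the object names are invented for illustration, and
the vapply() construction is just one possible way to get a numeric
matrix; it is not what summary.data.frame() does.

```r
# Hypothetical example data; the names are illustrative only.
df <- data.frame(a = c(155554.6, 155555.2, 155555.9),
                 b = c(1.5, 2.25, 3))

# Current behaviour: a table whose cells are already formatted strings.
s_tab <- summary(df)
is.character(s_tab)

# Alternative 1: a list of single-variable summaries, one per column.
s_list <- lapply(df, summary)

# Alternative 2: a numeric matrix, one column per variable.
# c(unclass(.)) drops the summary class and keeps the names.
s_mat <- vapply(df, function(v) c(unclass(summary(v))), numeric(6))
is.numeric(s_mat)
```

In both alternatives the entries stay numeric, so a print() method would
be free to decide how many digits to show.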
I wonder how much code out there is relying on the internal
structure of summary(<data.frame>).. because that is the one
part I'd definitely want to change, too.