[Rd] quantile() names

Martin Maechler maechler at stat.math.ethz.ch
Wed Dec 16 11:13:32 CET 2020


>>>>> Gabriel Becker 
>>>>>     on Mon, 14 Dec 2020 13:23:00 -0800 writes:

    > Hi Edgar, I certainly don't think quantile(x, .975) should
    > return 980, as that is a completely wrong answer.

    > I do agree that it seems like the name is a bit
    > offputting. I'm not sure how deep in the machinery you'd
    > have to go to get digits to have no effect on the names (I
    > don't have time to dig in right this second).

    > On the other hand, though, if we're going to make the
    > names not respect digits entirely, what do we do when
    > someone does quantile(x, 1/3)? That'd be a bad time had by
    > all without digits coming to the rescue, I think.

    > Best, ~G

And now we read more replies on this topic without anyone looking at
the pure R source code, which is pretty simple and easy to read.
Instead, people run experiments and take time to muse about their findings.

Honestly, I'm disappointed: I've always thought that if you
*write* on R-devel, you should be able to figure out a few
things yourself before doing so.

It's not rocket science to see/know that you need to quickly look at
the quantile.default() method function and then to note 
that it's  format_perc(.) which is used to create the names.
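
One possible way to take that quick look from the R prompt, as a sketch only: format_perc() is internal to 'stats' (not exported), so the second half assumes it is still present under that name in your R version and is guarded accordingly.

```r
# The default method that quantile() dispatches to for plain vectors
qd <- getS3method("quantile", "default")

# Source lines of the method that mention names(): where the labels are built
grep("names", deparse(qd), value = TRUE)

# format_perc() is not exported, so look it up in the 'stats' namespace.
# Guarded with exists() in case the helper is named differently in your R.
ns <- asNamespace("stats")
if (exists("format_perc", envir = ns, inherits = FALSE))
    print(get("format_perc", envir = ns))
```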

Almost surely, I've been a bit involved in creating parts of
this and probably am responsible for the current default
behavior.

	   ....
	   ....(sounds of digging) ...
	   ....
	   ....
	   ....
	   ....
	   ....
	   ....

--> Yes:

------------------------------------------------------------------------
r837 | maechler | 1998-03-05 12:20:37 +0100 (Thu, 05 Mar 1998) | 2 lines
Changed paths:
   M /trunk/src/library/base/R/quantile
   M /trunk/src/library/base/man/quantile.Rd

fixed names(.) construction
------------------------------------------------------------------------

With this diff  (my 'svn-diffB -c837 quantile') :
Index: quantile
===================================================================
21c21,23
< 	names(qs) <- paste(round(100 * probs), "%", sep = "")
---
> 	names(qs) <- paste(formatC(100 * probs, format= "fg", wid=1,
> 				   dig= max(2,.Options$digits)),
> 			   "%", sep = "")

-----------------------------------------------------------------
So this change was made before the names construction was modularized
into the format_perc() utility, and quite a while before R 1.0.0 ....
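
For readers who want to see the two constructions side by side, here is a minimal sketch. The helper names old_names() and new_names() are made up for illustration, getOption("digits") stands in for the historical .Options$digits, and the abbreviated formatC() arguments from the diff are spelled out in full.

```r
# Sketch only: old_names() / new_names() are hypothetical helpers that wrap
# the two names() constructions from the r837 diff.
probs <- c(0.975, 1/3)

# Before r837: always rounded to whole percents
old_names <- function(probs)
    paste(round(100 * probs), "%", sep = "")

# After r837: significant digits follow the 'digits' option, but at least 2
new_names <- function(probs, digits = getOption("digits"))
    paste(formatC(100 * probs, format = "fg", width = 1,
                  digits = max(2, digits)),
          "%", sep = "")

old_names(0.975)              # "98%"
new_names(0.975, digits = 7)  # "97.5%"  (default 'digits')
new_names(0.975, digits = 2)  # "98%"    (as with options(digits = 2))
new_names(probs, digits = 7)  # 1/3 is cut to 7 significant digits
```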

Now, 22.8 years later, I do think that indeed it was not
necessarily the best idea to make the names() construction depend  on the
'digits' option entirely and just protect it by using at least 2 digits.

What I think is better is to

1) provide an optional argument   'digits = 7',
   back compatible with the default value of getOption("digits")

2) when used, check that it is at least '1'
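
To make the two points concrete, a rough sketch of what they could look like from the user's side. quantile_digits() is a hypothetical wrapper written for this illustration, not the actual quantile.default() code and not a committed change.

```r
# Hypothetical wrapper illustrating points 1) and 2); NOT quantile.default().
quantile_digits <- function(x, probs = seq(0, 1, 0.25), digits = 7, ...) {
    # 2) when used, check that 'digits' is at least 1
    stopifnot(is.numeric(digits), length(digits) == 1L, digits >= 1)
    # Compute the quantiles without names, then label them ourselves
    qs <- quantile(x, probs = probs, names = FALSE, ...)
    # 1) the names depend on the 'digits' argument, not on getOption("digits")
    names(qs) <- paste0(formatC(100 * probs, format = "fg", width = 1,
                                digits = digits), "%")
    qs
}

op <- options(digits = 2)        # the setting from the original report
quantile_digits(1:1000, 0.975)   # name stays "97.5%"
options(op)                      # restore the previous setting
```

The printed value still follows options(digits = 2); only the name is decoupled from it.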

But then the output of some people's scripts / examples *will* change,
e.g., because they preferred to have a global setting of digits=5,

so I'm guessing it may make more people unhappy than other
people happy if we change this now, after close to 23 years  .. ??

Martin

--
Martin Maechler
ETH Zurich  and  R Core team


    > On Mon, Dec 14, 2020 at 11:55 AM Merkle, Edgar
    > C. <merklee using missouri.edu> wrote:

    >> All,
    >> 
    >> Consider the code below
    >> 
    >> options(digits=2)
    >>  x <- 1:1000 
    >> quantile(x, .975)

    >> The value returned is 975 (the 97.5th percentile), but
    >> the name has been shortened to "98%" due to the digits
    >> option. Is this intended? I would have expected the name
    >> to also be "97.5%" here. Alternatively, the returned
    >> value might be 980 in order to match the name of "98%".
    >> 
    >> Best, Ed
    >>


