[Rd] Question about quantile fuzz and GPL license

GILLIBERT, Andre
Wed Sep 15 18:52:33 CEST 2021


Martin Maechler wrote:
> OTOH,  type=7 is the default, and I guess used in 99.9% of
> all uses of quantile, *and* does never use any fuzz ....

Indeed. This also implies that the default should be chosen carefully when creating a new implementation of quantile() for a new programming language or library.
Most of the time, users rely on the default procedure and do not state which procedure was used in their statistical analysis reports or in their scientific or non-scientific articles.
The differences between the quantile procedures are minor, except in extreme scenarios such as a sample of size 2, or probs=0.001 with a sample of size 1000.
Still, standardization of procedures is desirable for the reproducibility of analyses, as well as for teaching (see https://doi.org/10.1080/10691898.2006.11910589 ).
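
To make the two extreme scenarios concrete, here is a minimal Python sketch of the continuous Hyndman and Fan types 4 to 9, written from the plotting-position formula m = a + p*(1 - a - b). The names hf_quantile and HF_PARAMS are hypothetical, chosen for this illustration. It is not R's implementation: it omits the fuzz handling discussed in this thread and the discontinuous types 1 to 3.

```python
import math

# (a, b) plotting-position parameters for the continuous
# Hyndman-Fan types 4 to 9 (illustrative table, not taken from R's source).
HF_PARAMS = {
    4: (0.0, 1.0),
    5: (0.5, 0.5),
    6: (0.0, 0.0),
    7: (1.0, 1.0),
    8: (1 / 3, 1 / 3),
    9: (3 / 8, 3 / 8),
}


def hf_quantile(data, p, qtype=7):
    """Sample quantile for probability p, Hyndman-Fan types 4 to 9 (no fuzz)."""
    xs = sorted(data)
    n = len(xs)
    a, b = HF_PARAMS[qtype]
    pos = a + p * (n + 1 - a - b)  # 1-based fractional position in xs
    j = math.floor(pos)
    h = pos - j                    # interpolation weight in [0, 1)
    # Clamp both neighbours to the valid 1-based range [1, n].
    lo = xs[min(max(j, 1), n) - 1]
    hi = xs[min(max(j + 1, 1), n) - 1]
    return (1 - h) * lo + h * hi


# Scenario 1: a sample of size 2.
for t in sorted(HF_PARAMS):
    print("n=2, p=0.25, type", t, "->", hf_quantile([0, 10], 0.25, t))

# Scenario 2: probs=0.001 with a sample of size 1000.
for t in sorted(HF_PARAMS):
    print("n=1000, p=0.001, type", t, "->", hf_quantile(range(1, 1001), 0.001, t))
```

With the sample (0, 10) and p=0.25, type 7 returns 2.5 while types 4, 5, 6, 8 and 9 all return 0, so the choice of default is clearly visible at this sample size.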

Hyndman and Fan wanted software packages to standardize on a single definition, but to no avail:
see https://robjhyndman.com/hyndsight/sample-quantiles-20-years-later/

In the absence of a standard, my personal advice would be to use the same default as a popular statistical package, such as R or SAS.

R, Julia and NumPy (Python) use type 7 as default.
Microsoft Excel and LibreOffice Calc use type 7 as default (although Excel versions >= 2010 also offer newer procedures).
SAS uses type 3 as default, unless prob=0.50.
Stata uses type 2 or type 6, depending on the procedure (https://data.princeton.edu/stata/markdown/quantiles.htm).

-- 
Sincerely
André GILLIBERT


