[Rd] Question about quantile fuzz and GPL license
    Martin Maechler 
    m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
       
    Wed Sep 15 10:46:45 CEST 2021
    
    
  
>>>>> GILLIBERT, Andre 
>>>>>     on Tue, 14 Sep 2021 16:13:05 +0000 writes:
    > On 9/14/21 9:22 AM, Abel AOUN wrote:
    >> However I don't get why epsilon is multiplied by 4 instead of simply using epsilon.
    >> Is there someone who can explain this 4 ?
    > .Machine$double.eps is the "precision" of floating point values for values close to 1.0 (between 0.5 and 2.0).
    > Using fuzz = .Machine$double.eps would have no effect if nppm is greater than or equal to 2.
    > Using fuzz = 4 * .Machine$double.eps can fix rounding errors for nppm < 8; for greater nppm, it has no effect.
    > Indeed:
    > 2 + .Machine$double.eps == 2
    > 8 + 4*.Machine$double.eps == 8
    > Since nppm is approximately equal to the quantile multiplied by the sample size, it can be much greater than 2 or 8.
hmm: not "quantile":
 it is approximately equal to the *'prob'* multiplied by the sample size
 {the quantiles themselves can be on any scale anyway, but fortunately
  they do not matter yet in these parts of the calculations}
but you're right in the main point that the 'nppm' values are
approx. proportional to  n.
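
The absorption effect described in the quoted text can be checked directly.
What follows is only a minimal sketch: it assumes nppm = n * probs as a
simplified stand-in for the type-dependent formula, and the names 'n' and
'probs' are illustrative.

```r
eps <- .Machine$double.eps

# Below the thresholds the increment survives the addition ...
small_ok <- c(1 + eps > 1, 4 + 4 * eps > 4)

# ... at the thresholds it is rounded away.
lost <- c(2 + eps == 2, 8 + 4 * eps == 8)

# nppm grows with n, so for realistic sample sizes an absolute
# fuzz of 4 * eps no longer changes nppm at all.
n     <- 1000
probs <- c(0.1, 0.5, 0.9)
nppm  <- n * probs                  # assumption: simplified stand-in
absorbed <- (nppm + 4 * eps) == nppm
```

On a platform with IEEE double arithmetic, every element of 'small_ok',
'lost' and 'absorbed' should be TRUE.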
    > Maybe the rounding errors are only problematic for small nppm; or only that case is taken into account.
    > Moreover, if rounding errors are cumulative, they can be much greater than the precision of the floating point value. I do not know how this constant was chosen and what the use-cases were.
I vaguely remember I've been wondering about this also (back at the time).
Experiential wisdom would tell us to take such  fuzz values as
*relative* to the magnitude of the values they are added to,
here 'nppm' (which is always >= 0, hence no need for  abs(.) as there usually would be).
So, instead of
    j <- floor(nppm + fuzz)
    h <- nppm - j
    if(any(sml <- abs(h) < fuzz, na.rm = TRUE)) h[sml] <- 0
it would be (something like)
    j <- floor(nppm*(1 + fuzz))
    h <- nppm - j
    if(any(sml <- abs(h) < fuzz*nppm, na.rm = TRUE)) h[sml] <- 0
or rather we would define fuzz as
   nppm * (k * .Machine$double.eps) 
for a small k.
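
To make the difference concrete, here is a small sketch comparing the two
variants. The helper 'split_index' and its arguments 'k' and 'relative' are
hypothetical names chosen for illustration; this is not the code in
quantile.default(), only the idea above written out.

```r
# Hypothetical helper: split nppm into integer part j and fraction h,
# using either an absolute fuzz (k * eps) or a relative one (nppm * k * eps).
split_index <- function(nppm, k = 4, relative = FALSE) {
    eps  <- .Machine$double.eps
    fuzz <- if (relative) nppm * (k * eps) else rep_len(k * eps, length(nppm))
    j <- floor(nppm + fuzz)
    h <- nppm - j
    # Treat tiny fractional parts as exactly zero (NA-safe).
    h[!is.na(h) & abs(h) < fuzz] <- 0
    list(j = j, h = h)
}

# A value exactly one representable step above 16,
# i.e. 16 contaminated by a single rounding error.
nppm <- 16 * (1 + .Machine$double.eps)

abs_res <- split_index(nppm)                   # absolute fuzz
rel_res <- split_index(nppm, relative = TRUE)  # relative fuzz
```

Here 'abs_res$h' stays at 16 * .Machine$double.eps, because that is not
smaller than the absolute fuzz, whereas 'rel_res$h' is set to 0; both
variants give j = 16.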
- - -
OTOH,  type=7 is the default, and I guess it is used in 99.9% of
all uses of quantile, *and* it never uses any fuzz ....
Martin
    > --
    > Sincerely
    > Andre GILLIBERT
    
    