[R] quantile() depends on order of probs?

Christos Argyropoulos argchris at hotmail.com
Sat Jun 19 13:35:06 CEST 2010



Hi, 
It seems to me that the results are actually the same; they are just not returned in the same order (R 2.10.1 on Windows Vista). If you call sort() on the output, the results are identical:
> sort(quantile(c(54, 72, 83, 112), type=6, probs=c(0, .25, .5, .75, 1)))
    0%    25%    50%    75%   100% 
 54.00  58.50  77.50 104.75 112.00 
> sort(quantile(c(54, 72, 83, 112), type=6, probs=c(.25, .5, .75, 1, 0)))
    0%    25%    50%    75%   100% 
 54.00  58.50  77.50 104.75 112.00 
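
A slightly more direct check than sorting by value is to line the two results up by name, so that each quantile is compared with its own counterpart. This is only a sketch; the names p1, p2, q1 and q2 are mine, and I would expect TRUE only on an R version where the two calls really do agree:

```r
x  <- c(54, 72, 83, 112)
p1 <- c(0, .25, .5, .75, 1)
p2 <- c(.25, .5, .75, 1, 0)

q1 <- quantile(x, type = 6, probs = p1)
q2 <- quantile(x, type = 6, probs = p2)

# Reorder q2 by the names of q1 ("0%", "25%", ...) before comparing,
# so the check does not depend on the order of probs.
same <- isTRUE(all.equal(q1, q2[names(q1)]))
print(same)
```

Unlike sort(), this would still pair the right elements up if two quantiles happened to have the same value.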

With such a small sample, the quantile values can depend strongly on the interpolation algorithm (the type argument) used to compute them, so exercise caution:

> sort(quantile(c(54, 72, 83, 112), type=7, probs=c(0, .25, .5, .75, 1)))
    0%    25%    50%    75%   100% 
 54.00  67.50  77.50  90.25 112.00 
> sort(quantile(c(54, 72, 83, 112), type=7, probs=c(.25, .5, .75, 1, 0)))
    0%    25%    50%    75%   100% 
 54.00  67.50  77.50  90.25 112.00 
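
To see where the 25% values above come from, the two interpolation rules can be worked by hand. This is a rough sketch, not the code quantile() uses: interp() is a hypothetical helper of mine that is only valid for positions strictly inside the sample, and the position formulas are the usual ones for type 6, (n + 1) * p, and type 7, 1 + (n - 1) * p.

```r
x <- sort(c(54, 72, 83, 112))
n <- length(x)
p <- 0.25

# Fractional position of the quantile among the order statistics
pos6 <- (n + 1) * p      # type 6: 1.25
pos7 <- 1 + (n - 1) * p  # type 7: 1.75

# Hypothetical helper: linear interpolation between neighbouring
# order statistics; assumes 1 <= pos < length(x).
interp <- function(x, pos) {
  lo <- floor(pos)
  x[lo] + (pos - lo) * (x[lo + 1] - x[lo])
}

q6 <- interp(x, pos6)  # 54 + 0.25 * (72 - 54)
q7 <- interp(x, pos7)  # 54 + 0.75 * (72 - 54)
print(c(type6 = q6, type7 = q7))
```

With only four observations, the gap between 54 and 72 is large, so a shift of half a position moves the estimate by 9.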

Christos Argyropoulos


----------------------------------------
> Date: Fri, 18 Jun 2010 21:02:41 -0700
> From: jwiley.psych at gmail.com
> To: r-help at r-project.org
> Subject: [R] quantile() depends on order of probs?
>
> Hello All,
>
> I am trying to figure out the rationale behind why quantile() returns
> different values for the same probabilities depending on whether 0 is
> first.
>
> Here is an example:
>
> quantile(c(54, 72, 83, 112), type=6, probs=c(0, .25, .5, .75, 1))
> quantile(c(54, 72, 83, 112), type=6, probs=c(.25, .5, .75, 1, 0))
>
> It seems to come down to this part of the code for quantile:
>
> fuzz <- 4 * .Machine$double.eps
> nppm <- a + probs * (n + 1 - a - b)
> j <- floor(nppm + fuzz)
> h <- nppm - j
> qs <- x[j + 2L]
> qs[h == 1] <- x[j + 3L][h == 1]
> other <- (h > 0) && (h < 1)
> if (any(other))
> qs[other] <- ((1 - h) * x[j + 2L] + h * x[j + 3L])[other]
>
> In my example, a and b are both 0, and n = 4. Particularly, the
> alternate formula for qs is only used when the first element of h is
> both > 0 and < 1. Any ideas on this? It seems like a simple
> alternative would be
>
> other <- (h > 0) & (h < 1)
>
> but I do not know if that would cause problems for other quantile
> formulae. By the way, this comes around lines 39-70 in
> quantile.default in:
>
> > version
>                _
> platform       x86_64-pc-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          2
> minor          11.1
> year           2010
> month          05
> day            31
> svn rev        52157
> language       R
> version.string R version 2.11.1 (2010-05-31)
>
>
> Best regards,
>
> Josh
>
> --
> Joshua Wiley
> Ph.D. Student
> Health Psychology
> University of California, Los Angeles
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
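
For what it is worth, the difference between && and & in the quoted snippet can be sketched as follows. This is not the actual quantile.default source: quantile_sketch() and its vectorised argument are hypothetical, and the sort-and-pad step is my assumption about what precedes the quoted lines. The scalar branch spells out the "first element only" behaviour of && explicitly, because recent R versions refuse && on vectors longer than one.

```r
# Hypothetical re-creation of the quoted snippet (a = b = 0 is type 6).
quantile_sketch <- function(x, probs, a = 0, b = 0, vectorised = TRUE) {
  n <- length(x)
  x <- sort(x)
  # Assumed padding so that x[j + 2L] is the j-th order statistic
  x <- c(x[1L], x[1L], x, x[n], x[n])
  fuzz <- 4 * .Machine$double.eps
  nppm <- a + probs * (n + 1 - a - b)
  j <- floor(nppm + fuzz)
  h <- nppm - j
  qs <- x[j + 2L]
  qs[h == 1] <- x[j + 3L][h == 1]
  other <- if (vectorised) {
    (h > 0) & (h < 1)            # element-wise, the proposed alternative
  } else {
    (h[1L] > 0) && (h[1L] < 1)   # what && did with a vector: first element only
  }
  if (any(other))
    qs[other] <- ((1 - h) * x[j + 2L] + h * x[j + 3L])[other]
  qs
}

x  <- c(54, 72, 83, 112)
p1 <- c(0, .25, .5, .75, 1)
p2 <- c(.25, .5, .75, 1, 0)

print(quantile_sketch(x, p1, vectorised = FALSE))  # h[1] is 0: no interpolation
print(quantile_sketch(x, p2, vectorised = FALSE))  # h[1] is 0.25: all interpolated
print(quantile_sketch(x, p1, vectorised = TRUE))   # interpolates where needed
```

In this sketch the scalar test skips interpolation entirely whenever the first probability lands exactly on an order statistic, which is why putting 0 first changes the answer, while the element-wise test gives the same values regardless of the order of probs.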
 		 	   		  