[R] Percentile function

ralf.herold@charite.de ralf.herold at charite.de
Tue Oct 17 18:35:44 CEST 2000


> observation at 0% and the largest at 100% and the value at observation
> point i being (i-1)*100%/(N-1). This is not quite the definition you
> have and often not what people expect...

I recall (and have looked up) that SAS offers five ways to compute
percentiles, which are in short, 1. weighted average, 2. closest observation
number, 3. empirical distribution function, 4. "aimed" weighted average, and
5. empirical distribution function with averaging. The last is the default.
We once choose a different option, for a certain question. Here is a quick
translation into R, could this be interesting for anyone?

#################################################
# percentile computation functions, adapted from
# SAS Procedures Guide, Version 6, 3rd ed., 625f.
#
perc <- function (x, p = c(5,10,25,50,75,95), pctldef = 5) {
  #
  # initialization lacking error checking
  #
  x <- na.omit (as.vector (x)) # transform input into non-empty vector
  p <- na.omit (as.vector (p)) # wanted percentiles into non-empty vector
  x <- sort    (x)             # sort it in ascending order
  n <- length  (x)             # length of non-empty vector
  #
  if (pctldef == 4) n <- n + 1 # increase n by one in case pctldef == 4
  #
  j <- trunc   (n * p / 100)   # j is the integer part of the product n * p
  g <- - j   + (n * p / 100)   # g is the fractional part of the product n *
p
  #
  # the different computational procedures follow
  #
  if (pctldef == 1)            # weighted average at x\sub{np}
                               # x\sub{0} is taken to be x\sub{1}, cf. above
    perc <- (1 - g) * x [j] + g * x [j + 1]
  #
  if (pctldef == 2)  {         # observation number closest to np
    i <- ifelse (g == 0.5,
		 ifelse (trunc (j / 2) == j / 2, j, j + 1),
		 trunc  (n * p / 100 + 0.5))
    perc <- x [i]}
  #
  if (pctldef == 3)            # empirical distribution function
    perc <- x [ifelse (g == 0, j, j + 1)]
  #
  if (pctldef == 4)            # weighted average aimed at x\sub{p*(n+1)}
                               # x\sub{n+1} is taken to be x\sub{n}, cf.
above
    perc <- (1 - g) * x [j] + g * x [j + 1]
  #
  if (pctldef == 5)            # empirical distribution function with
averaging
    perc <- ifelse (g == 0,
		   (x [j] + x [j + 1]) / 2,
		    x [j + 1])
  #
  names  (perc) <- p
  return (perc)
}
# Ralf Herold, ralf herold at charite.de, 17.10.2000 18:42:04
############################################################################
###

# for testing:
test.x <- c (1,2,3,4,5,6,7,8,9,4,2,3,6,7,8,2,5,2)
test.p <- c (26, 51)
for (i in 1:5) print (perc (test.x, test.p, i))
# end.


---- Dr. med. Ralf Herold
| Koordinationszentrale Kompetenznetz Pädiatrische Onkologie und Hämatologie
| Charité Campus Virchow-Klinikum Medizinische Fakultät Humboldt-Universität
| D-13353 Berlin, Augustenburger Platz 1, Raum 4.3412 4. Etage Mittelallee 8
| Tel. +49(30)450-66834 Fax -66906 Sprach-/Faxbox +49(180)505254-873936
| mailto:ralf.herold at charite.de  http://www.knm-poh.charite.de/

> -----Original Message-----
> From: owner-r-help at stat.math.ethz.ch
> [mailto:owner-r-help at stat.math.ethz.ch]On Behalf Of Peter Dalgaard BSA
> Sent: Tuesday, October 17, 2000 2:40 PM
> To: salwin at mitretek.org
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Percentile function

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list