[R] Feature request: add boxplot()s to current plot (given x[i])

David James dj at research.bell-labs.com
Fri Dec 10 17:58:52 CET 1999


Hi,

I experimented some time ago with a set of S functions to draw
"generalized" boxplots (e.g., "vase" or "violin", "diamond" plots, etc.).
There's code to draw these gboxplots at arbitrary positions. Please take
a look at the help below and let me know if you'd like either to port
it to R or scavenge some of the code.

David A James                        Phone: (908) 582-3082
Bell Labs, Lucent Technologies       Fax:   (908) 582-3340
600 Mountain Ave                     Email: dj at bell-labs.com
Murray Hill, NJ 07974
--------------------------------------------------------------------
Generalized Box Plots

USAGE:
       gboxplot(..., type = "box", range.=,
               width=, varwidth=F, notch=F, names.=, horiz = T,
               fill=F, col=1, old = T, plot.it=TRUE)

       gboxplot(..., type = "vase", from=, to=, kernel.width=, n=,
               width=, varwidth=F, notch=F, names.=, horiz = T,
               fill=F, col=1, plot.it=TRUE)

       gboxplot(..., type = "diamond", width=, varwidth=F,
            names.=, horiz = T, fill=F, col=1, pch, plot.it=TRUE)

       gboxplot(..., type = "pts", jitter.pts = F,
               width=, varwidth=F, notch=F, names.=, horiz = T,
               fill=F, col=1, pch, plot.it=TRUE)
ARGUMENTS:
...:    vectors or a list containing a number of numeric components
       (e.g., the output of 'split').  Missing values (NAs) are
       allowed.
type=:    character string (the first letter suffices) specifying the
       type of gboxplot: currently "box" for Tukey's boxplots, "vase"
       for vase or violin plots (see Benjamini (1988) and Hintze and
       Nelson (1998)), "diamond" for diamond plots (see, for instance,
       JMP (1995)), or "pts" for one-dimensional histograms (see
       Chambers et al. (1983)).  Type may also be the name of a
       user-written function that computes an object for which there
       exists a 'draw' method.  For instance, type = 'cmp.vase'
       specifies the function that computes vases and returns an
       object of class "vase"; the method 'draw.vase' (automatically
       called by the generic 'draw') plots the vases.
range.=:    controls the strategy for the whiskers and the detached
       points beyond the whiskers.  By default, whiskers are drawn to
       the nearest value not beyond a standard range from the
       quartiles; points beyond are drawn individually.  Giving
       'range.=0' forces whiskers to the full data range.  Any
       positive value of 'range.' multiplies the standard range by
       this amount.  The standard range is 1.5*(interquartile range).
width=:    vector of relative box widths.  See also argument
       'varwidth'.
varwidth=:     if TRUE, box widths will be proportional to the
       square root of the number of observations for the box.
notch=:    if TRUE, notched boxes are drawn, where non-overlapping
       notches of two boxes indicate a difference at a rough 5%
       significance level.
names.=:     optional character vector of names for the groups.
       If omitted, names used in labeling the plot will be taken
       from the names of the arguments and from the names attribute
       of lists.
plot.it=:   if TRUE, the box plot will be produced; otherwise,
       the calculated summaries of the arguments are returned.
old=:      if TRUE, the plot will be produced in the style described
       in the Tukey (1977) reference; otherwise, the plot will follow
       the more modern style introduced in Tukey (1990), where the
       advantages of the new style are described.
horiz=:   if TRUE, boxes are drawn horizontally.
from:      (vaseplots) lower bound for the percent of data to use
       in fitting 'density'.  By default 0.25.
to:    (vaseplots) upper bound for the percent of data to use in
       fitting 'density'.  By default 0.75.
kernel.width:     (vaseplots) width of the kernel window, as defined
       in the function 'density'.  Its default corresponds to the
       width of a histogram bar as computed by Doane's rule.
n:     (vaseplots) number of equally spaced density estimates.
       Default is 25.
jitter.pts:     (pts) if logical, it specifies whether or not to
       jitter the points inside each group.  If numeric, it specifies
       the amount, in data units, to jitter.  Default is FALSE.
fill:     should boxes or vases be filled? Default is FALSE.
col:   vector of colors for each group.
pch:   vector of plotting characters for each group.

       Graphical parameters may also be supplied as arguments to
       this function (see 'par').
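
       The whisker rule described under 'range.' can be sketched in R
       as follows.  This is only an illustration, not the gboxplot
       source: the helper name 'whisker.limits', the default of 1 for
       'range.', and the use of quantile() for the quartiles are all
       assumptions made here.

```r
# Hypothetical helper (not part of gboxplot): whisker ends and detached
# points for one group, following the 'range.' description above.
# Assumptions: default range. = 1, quartiles taken from quantile().
whisker.limits <- function(x, range. = 1) {
    x <- x[!is.na(x)]                          # NAs are allowed; drop them
    q <- quantile(x, c(0.25, 0.75), names = FALSE)
    if (range. == 0) {
        # range. = 0 forces the whiskers to the full data range
        return(list(whiskers = range(x), out = numeric(0)))
    }
    step <- range. * 1.5 * (q[2] - q[1])       # multiple of the standard range
    inside <- x >= q[1] - step & x <= q[2] + step
    list(whiskers = range(x[inside]),          # nearest values not beyond it
         out = x[!inside])                     # points drawn individually
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
w  <- whisker.limits(x)               # standard range: 30 is detached
w0 <- whisker.limits(x, range. = 0)   # whiskers reach the extremes
```

       With the default, the value 30 falls beyond the standard range
       and is reported separately; with 'range.=0' it becomes the
       upper whisker end.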
VALUE:
       If 'plot.it' is FALSE, the value is a list with one element
       per data vector, each having the components listed below.
       Otherwise the generic function 'draw' is invoked with these
       components, plus optional 'width', 'varwidth' and 'notch', to
       produce the plot.  Note that 'draw' returns a list of
       box/vase centers.

stats:    vector giving the upper extreme, upper quartile, median,
       lower quartile, and lower extreme for each box.
n:     the number of observations in each group.
conf:     vector giving confidence limits for the median.
out:   vector of outlying points.
names.:   names for each box (see argument 'names.' above).
dnsty:     a list with 'x' and 'y' components as output by 'density'.
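
       For comparison, base R's boxplot.stats() returns components
       with the same names (stats, n, conf, out) for a single vector.
       The sketch below uses that R function, not the S code described
       here; note that R orders 'stats' from the lower end upward, so
       it is reversed to match the ordering documented above.

```r
# Base R analogue of the summary components; boxplot.stats() is R's own
# function and is only assumed to be comparable to gboxplot's output.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
s <- boxplot.stats(x, coef = 1.5)

s$stats        # lower whisker, lower hinge, median, upper hinge, upper whisker
s$n            # number of non-missing observations
s$conf         # approximate confidence limits for the median (notch ends)
s$out          # points beyond the whiskers

upper.first <- rev(s$stats)   # upper value first, as in the list above
```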

NOTES:
       In the case of vase plots, the density is estimated using
       kernel smoothing.  This was done for expediency, but other
       density estimates may be easily added (e.g., local polynomial
       fitting as in Loader (1996)).  Also note that the aspect ratio
       of the density traces may be such that it distorts important
       features of the data.
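
       A rough R sketch of such a kernel density trace is given below.
       It rests on several assumptions that the help text does not
       settle: that 'from' and 'to' are quantile fractions bounding
       where the trace is evaluated, that the Doane bar width is
       passed unchanged as the kernel window width, and that R's
       density() stands in for the S function.  The names
       'doane.width' and 'vase.density' are made up for illustration.

```r
# Illustrative only; not the gboxplot implementation.
doane.width <- function(x) {
    n  <- length(x)
    m  <- mean(x)
    g1 <- mean((x - m)^3) / mean((x - m)^2)^1.5       # sample skewness
    se <- sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))     # its standard error
    k  <- 1 + log2(n) + log2(1 + abs(g1) / se)        # Doane's number of bars
    diff(range(x)) / k                                # width of one bar
}

vase.density <- function(x, from = 0.25, to = 0.75, n = 25) {
    x <- x[!is.na(x)]
    q <- quantile(x, c(from, to), names = FALSE)
    # 'width' is density()'s S-compatibility argument for the window width
    density(x, width = doane.width(x), from = q[1], to = q[2], n = n)
}

set.seed(1)
y <- rnorm(200)
d <- vase.density(y)     # list-like object with x and y components
```

       The result has 'x' and 'y' components, which is the shape
       described for 'dnsty' under VALUE.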

REFERENCE:
       Tukey, John W., Exploratory Data Analysis, Addison-Wesley,
       Reading, Mass., 1977.

       Tukey, John W., "Data Based Graphics:  Visual  Display  in
       the  Decades  to  Come", Statistical Science, pp. 327-339,
       1990.

       Benjamini, Yoav, "Opening the Box of a Boxplot", The American
       Statistician, pp. 257-262, 1988.

       Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P.
       A., Graphical Methods for Data Analysis, Wadsworth, Pacific
       Grove, CA, 1983.

       Hintze, Jerry L., and Nelson, Ray D., "Violin Plots: A Box
       Plot-Density Trace Synergism", The American  Statistician,
       pp. 181-184, 1998.

       Loader, Clive, "Local Likelihood Density Estimation", Annals
       of Statistics, 1996.

       "JMP User's Guide", SAS Institute Inc., 1995.

EXAMPLES:
       gboxplot(group1,group2,group3)

       gboxplot(split(salary,age),varwidth=TRUE,notch=TRUE)

       # the example plot is produced by:
       gboxplot(
             split(lottery.payoff,lottery.number%/%100),
             main=lottery.label,
             sub="Leading Digit of Winning Numbers",
             ylab="Payoff")
       gboxplot( split(Mileage, Type), type = "vase", col=1:6)
       gboxplot( split(Mileage, Type), type = "pts", jitter.pts=3, col=1:6)
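
       For readers without the S code, drawing boxplots at chosen
       positions on an existing plot can be approximated in R with the
       'add' and 'at' arguments of boxplot().  The sketch below is not
       gboxplot; the data set (mtcars) and the positions are picked
       purely for illustration.

```r
# Base R sketch: add boxplots to the current plot at given x positions.
pdf(NULL)                                   # null device; nothing is written
groups <- split(mtcars$mpg, mtcars$cyl)     # one numeric vector per group
pos <- c(1, 5, 9)                           # arbitrary x positions

# Set up an empty plot, then draw the boxes onto it.
plot(range(pos) + c(-1, 1), range(mtcars$mpg), type = "n",
     xaxt = "n", xlab = "cyl", ylab = "mpg")
b <- boxplot(groups, at = pos, add = TRUE, varwidth = TRUE)
invisible(dev.off())
```

       The returned list 'b' holds the summaries (stats, n, conf, out,
       names) for the groups that were drawn.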

