[R] "continuous" boxplot?

hadley wickham h.wickham at gmail.com
Tue Oct 2 21:25:15 CEST 2007


A few other nice references for dealing with many points in a
scatterplot are:

D. B. Carr, R. J. Littlefield, W. L. Nicholson, and J. S. Littlefield.
Scatterplot matrix techniques for large n. Journal of the American
Statistical Association, 82(398):424–436, 1987.

W. S. Cleveland and R. McGill. The many faces of a scatterplot.
Journal of the American Statistical Association, 79(388):807–822,
1984.

A. Unwin, M. Theus, and H. Hofmann. Graphics of Large Datasets. Springer, 2006.

Another technique is to overlay contours of a 2d kernel density
estimate - this is somewhat similar to a bag-plot, although with
different underlying assumptions.
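
As a rough sketch of the contour overlay (not tuned for a real dataset),
something like the following should work. It assumes MASS::kde2d with its
default bandwidth; the simulated x and y and the names dens and idx are
made up here purely for illustration and stand in for the real data.

```r
library(MASS)  # kde2d() lives in MASS, a recommended package shipped with R

set.seed(1)
n <- 20000                          # illustrative size, smaller than 1 million
x <- runif(n, -1, 1)                # simulated x in [-1, 1] (assumption)
y <- plogis(2 * x + rnorm(n))       # simulated y in (0, 1) (assumption)

# 2d kernel density estimate evaluated on a 100 x 100 grid
dens <- kde2d(x, y, n = 100)

# Draw a thinned random subset of the points, then overlay the contours
idx <- sample(n, 2000)
plot(x[idx], y[idx], pch = ".", col = "grey60", xlab = "x", ylab = "y")
contour(dens, add = TRUE, nlevels = 8)
```

With a million points you would probably want to estimate the density on a
random subsample, since kde2d does work proportional to the number of
points times the grid size.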

It's also important to think about whether you're interested in the
joint (e.g. bag plot) or conditional (e.g. quantile regression)
density.
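
For the conditional view, a crude base-R stand-in for quantile regression
is to bin x and compute quantiles of y within each bin. This is only a
sketch: the 40 equal-width bins, the chosen quantiles, and the simulated
data are all assumptions for illustration, and it is binning rather than a
fitted quantile regression.

```r
set.seed(1)
n <- 100000
x <- runif(n, -1, 1)                # simulated x in [-1, 1] (assumption)
y <- plogis(2 * x + rnorm(n))       # simulated y in (0, 1) (assumption)

# Cut x into 40 equal-width bins and record each bin's midpoint
breaks <- seq(-1, 1, length.out = 41)
bin    <- cut(x, breaks, include.lowest = TRUE)
mids   <- (head(breaks, -1) + tail(breaks, -1)) / 2

# Quantiles of y within each bin; after t() there is one row per bin
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q     <- t(sapply(split(y, bin), quantile, probs = probs))

# One line per quantile, with the median drawn thicker
matplot(mids, q, type = "l", lty = 1, col = "black",
        lwd = c(1, 1, 3, 1, 1), xlab = "x", ylab = "y", ylim = c(0, 1))
```

The result reads much like a row of boxplots joined up along x, with the
median and the quartile band traced as curves.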

Hadley


On 10/2/07, Bert Gunter <gunter.berton at gene.com> wrote:
>
>
> Folks:
>
> I found the references in the previous replies to this vexing data
> visualization issue to be quite interesting and useful. I think it fair to
> say that there is no single "best" way to do this -- it all depends on what
> you need to learn, and probably several alternative displays will be
> necessary to get the important information the data have to convey.
> However, as always, this issue has been considered before, and it may be
> worthwhile to at least consider an already available "standard" approach
> using shingles and a trellis-type plot. ?xyplot and ?shingle should get you
> started (you probably want to shingle or bin on quantiles of y). The
> canonical reference is Bill Cleveland's VISUALIZING DATA (see "coplots").
>
>
> Bert Gunter
> Genentech Nonclinical Statistics
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Jim Porzak
> Sent: Tuesday, October 02, 2007 11:19 AM
> To: Karin Lagesen
> Cc: r-help at r-project.org
> Subject: Re: [R] "continuous" boxplot?
>
> Karin,
>
> I like to use bagplots in these cases where there are a lot of cases and
> scatter plots become one big smudge.
>
> See
> http://www.wiwi.uni-bielefeld.de/~wolf/software/R-wtools/bagplot/bagplot.pdf
>
> And some further examples on slides 36 - 39 of
> http://www.porzak.com/JimArchive/JimPorzak_CIwithR_useR2006_tutorial.pdf
>
> --
> HTH,
> Jim Porzak
> Responsys, Inc.
> San Francisco, CA
> http://www.linkedin.com/in/jimporzak
>
> On 10/1/07, Karin Lagesen <karin.lagesen at medisin.uio.no> wrote:
> >
> >
> >
> > I have two vectors x and y, which I would like to plot against each
> > other. I am also displaying other data in this plot. However, I have
> > about 1 million points to plot, and just plotting them x against y is
> > not very informative. What I'd like to do is to do sort of a
> > continuous box plot.
> >
> > My x values go from -1 to 1 and my y values from 0 to 1, so I'd like
> > to plot the median and quantiles, and possibly also all of the
> > outliers somehow. Are there any facilities in R for doing something
> > like this, or would I need to do this the hard coded way?
> >
> > Thank you very much for your help!
> >
> > Karin
> > --
> > Karin Lagesen, PhD student
> > karin.lagesen at medisin.uio.no
> > http://folk.uio.no/karinlag
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>


-- 
http://had.co.nz/


