[R] Adding 95% contours around scatterplot points with ggplot2

Ista Zahn istazahn at gmail.com
Mon Jan 28 21:59:12 CET 2013


Hi Nate,

I infer from the stat_density2d documentation that the calculation is
carried out by the kde2d function in the MASS package. Refer to ?kde2d
for details.
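
If you want to report what portion of the density a contour encloses,
one option is to call kde2d yourself and find the density height whose
contour holds about 95% of the estimated mass. The following is only a
rough sketch of that idea, not what stat_density2d does internally: the
simulated x and y, the grid size n = 100, the default bandwidth, and the
names mass, cum_mass and level95 are all my own assumptions for
illustration.

```r
library(MASS)  # provides kde2d

set.seed(1)
# Hypothetical data standing in for one group of points
x <- rnorm(200, mean = 0.4, sd = 0.1)
y <- rnorm(200, mean = 0.3, sd = 0.1)

# 2D kernel density estimate on a 100 x 100 grid (default bandwidth)
dens <- kde2d(x, y, n = 100)

# Approximate probability mass held by each grid cell
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])
mass <- dens$z * dx * dy

# Order cells from densest to least dense and accumulate their mass,
# renormalising because the grid only covers the range of the data
ord <- order(dens$z, decreasing = TRUE)
cum_mass <- cumsum(mass[ord]) / sum(mass)

# Density height at which the densest cells first hold 95% of the mass
level95 <- dens$z[ord][which(cum_mass >= 0.95)[1]]

# Contour line(s) at that height, as a list of x/y coordinate vectors
cl <- contourLines(dens$x, dens$y, dens$z, levels = level95)
```

Note that this is 95% of the estimated density, which is not the same
thing as a 95% confidence region and will not enclose exactly 95% of the
observed points. You would repeat it once per level of your grouping
variable.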

Best,
Ista

On Mon, Jan 28, 2013 at 3:56 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
> Hi Ista,
>
> Thanks. That does look pretty nice and I hadn't realized that was possible.
> Do you know how to extract information regarding those curves? I'd like to
> be able to report something about what portion of the data they encompass or
> really any other feature about them in a figure legend. I'll look into
> stat_density2d and see if I can determine how they are set.
>
> Thanks for your help,
>
> Nate
>
>
> On Mon, Jan 28, 2013 at 12:37 PM, Ista Zahn <istazahn at gmail.com> wrote:
>>
>> Hi Nate,
>>
>> You can make it less busy using the bins argument. This is not
>> documented, except in the examples to stat_contour, but try
>>
>> ggplot(data=data, aes(x, y, colour=(factor(level)), fill=level))+
>>         geom_point()+
>>         stat_density2d(bins=2)
>>
>> HTH,
>> Ista
>>
>> On Mon, Jan 28, 2013 at 2:43 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
>> > Thanks Ista,
>> >
>> > I have played a bit with stat_density2d as well. It doesn't completely
>> > capture what I am looking for and ends up being quite busy at the same
>> > time. I'm looking for a way of helping those looking at the figure to
>> > see the broad patterns of where in the x/y space the data from
>> > different groups are distributed. The point of the 95% CI type idea is
>> > that I don't end up arbitrarily drawing circles around each set of
>> > points. I appreciate your direction though.
>> >
>> > Nate
>> >
>> >
>> > On Mon, Jan 28, 2013 at 10:50 AM, Ista Zahn <istazahn at gmail.com> wrote:
>> >>
>> >> Hi Nathan,
>> >>
>> >> This only fits some of your criteria, but have you looked at
>> >> ?stat_density2d?
>> >>
>> >> Best,
>> >> Ista
>> >>
>> >> On Mon, Jan 28, 2013 at 12:53 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > I have been looking for a means of adding a contour around some points
>> >> > in a scatterplot to represent the center of density of the data. I'm
>> >> > imagining something like a 95% confidence estimate drawn around the
>> >> > data.
>> >> >
>> >> > So far I have found some code for drawing polygons around the data.
>> >> > These look nice, but in some cases the polygons are strongly
>> >> > influenced by outlying points. Does anyone have a thought on how to
>> >> > draw a contour which is more along the lines of a 95% confidence
>> >> > space?
>> >> >
>> >> > I have provided a working example below to illustrate the drawing of
>> >> > the polygons. As I said, I would rather have three "ovals"/95%
>> >> > contours drawn around the points by "level" to capture the different
>> >> > density distributions without the visualization being heavily
>> >> > influenced by outliers.
>> >> >
>> >> > I have looked into the code provided here from Hadley
>> >> >
>> >> > https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/85q4SQ9q3V8
>> >> > using the mvtnorm package and the dmvnorm function, but haven't been
>> >> > able to get it to work for my data example. The calculated densities
>> >> > are always zero at this step of Hadley's code:
>> >> >
>> >> > dgrid$dens <- dmvnorm(as.matrix(dgrid), ex_mu, ex_sigma)
>> >> >
>> >> > I appreciate any assistance.
>> >> >
>> >> > Thanks,
>> >> > Nate
>> >> >
>> >> > library(plyr)     # for ddply
>> >> > library(ggplot2)
>> >> >
>> >> > x <- c(seq(0.15, 0.4, length.out = 30),
>> >> >        seq(0.2, 0.6, length.out = 30),
>> >> >        seq(0.4, 0.6, length.out = 30))
>> >> >
>> >> > y <- c(0.55,
>> >> >        x[1:29] + 0.2 * rnorm(29, 0.4, 0.3),
>> >> >        x[31:60] * rnorm(30, 0.3, 0.1),
>> >> >        x[61:90] * rnorm(30, 0.4, 0.25))
>> >> > data <- data.frame(level = c(rep(1, 30), rep(2, 30), rep(3, 30)),
>> >> >                    x = x, y = y)
>> >> >
>> >> > # convex hull of the points within each level
>> >> > find_hull <- function(data) data[chull(data$x, data$y), ]
>> >> > hulls <- ddply(data, .(level), find_hull)
>> >> >
>> >> > fig1 <- ggplot(data = data, aes(x, y, colour = factor(level),
>> >> >                                 fill = level)) + geom_point()
>> >> > fig1 <- fig1 + geom_polygon(data = hulls, alpha = .2)
>> >> > fig1
>> >> >
>> >
>> >
>
>


