[R] Adding 95% contours around scatterplot points with ggplot2
Ista Zahn
istazahn at gmail.com
Mon Jan 28 21:37:46 CET 2013
Hi Nate,
You can make it less busy using the bins argument. This is not
documented, except in the examples to stat_contour, but try
ggplot(data=data, aes(x, y, colour=(factor(level)), fill=level))+
geom_point()+
stat_density2d(bins=2)
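
If the goal is one 95% "oval" per group rather than density contours, one option is a normal-theory data ellipse built from each group's mean and covariance. The following is only a minimal sketch under stated assumptions: the helper name ellipse95 is hypothetical, the data are re-simulated along the lines of the example in the original post (with a seed added), and each group is assumed to be roughly bivariate normal. It describes where about 95% of the points would fall under that assumption; it is not a confidence region for the group mean.

```r
# Sketch: 95% normal-theory data ellipse per group (base R for the geometry).
# Assumption: each group is roughly bivariate normal.
set.seed(1)  # seed added here only for reproducibility
x <- c(seq(0.15, 0.4, length.out = 30), seq(0.2, 0.6, length.out = 30),
       seq(0.4, 0.6, length.out = 30))
y <- c(0.55, x[1:29] + 0.2 * rnorm(29, 0.4, 0.3),
       x[31:60] * rnorm(30, 0.3, 0.1), x[61:90] * rnorm(30, 0.4, 0.25))
dat <- data.frame(level = rep(1:3, each = 30), x = x, y = y)

# Hypothetical helper: points on the ellipse whose squared Mahalanobis
# distance from the group mean equals the chi-square(2) quantile for `level`.
ellipse95 <- function(d, level = 0.95, n = 100) {
  xy <- cbind(d$x, d$y)
  mu <- colMeans(xy)
  sigma <- cov(xy)
  r <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(sigma) returns upper-triangular R with t(R) %*% R == sigma,
  # so circle %*% R maps the unit circle onto the covariance ellipse.
  pts <- sweep(r * (circle %*% chol(sigma)), 2, mu, "+")
  data.frame(level = d$level[1], x = pts[, 1], y = pts[, 2])
}

ellipses <- do.call(rbind, lapply(split(dat, dat$level), ellipse95))

# Overlay the ellipses on the scatterplot if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  fig <- ggplot(dat, aes(x, y, colour = factor(level))) +
    geom_point() +
    geom_path(data = ellipses)
}
```

Because the ellipse depends only on the mean and covariance, a single outlier affects it less than it affects a convex hull, although it can still inflate the covariance. More recent ggplot2 releases also provide stat_ellipse(type = "norm", level = 0.95), which draws a similar ellipse without the manual step.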
HTH,
Ista
On Mon, Jan 28, 2013 at 2:43 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
> Thanks Ista,
>
> I have played a bit with stat_density2d as well. It doesn't completely
> capture what I am looking for and ends up being quite busy at the same time.
> I'm looking for a way of helping those looking at the figure to see the
> broad patterns of where in the x/y space the data from different groups are
> distributed. Using the 95% CI type idea is so that I don't end up
> arbitrarily drawing circles around each set of points. I appreciate your
> direction though.
>
> Nate
>
>
> On Mon, Jan 28, 2013 at 10:50 AM, Ista Zahn <istazahn at gmail.com> wrote:
>>
>> Hi Nathan,
>>
>> This only fits some of your criteria, but have you looked at
>> ?stat_density2d?
>>
>> Best,
>> Ista
>>
>> On Mon, Jan 28, 2013 at 12:53 PM, Nathan Miller <natemiller77 at gmail.com>
>> wrote:
>> > Hi all,
>> >
>> > I have been looking for a means of adding a contour around some points in
>> > a scatterplot as a way of representing the center of density of the
>> > data. I'm imagining something like a 95% confidence estimate drawn
>> > around the data.
>> >
>> > So far I have found some code for drawing polygons around the data.
>> > These look nice, but in some cases the polygons are strongly influenced
>> > by outlying points. Does anyone have a thought on how to draw a contour
>> > which is more along the lines of a 95% confidence space?
>> >
>> > I have provided a working example below to illustrate the drawing of the
>> > polygons. As I said I would rather have three "ovals"/95% contours drawn
>> > around the points by "level" to capture the different density
>> > distributions without the visualization being heavily influenced by
>> > outliers.
>> >
>> > I have looked into the code provided here from Hadley
>> > https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/85q4SQ9q3V8
>> > using the mvtnorm package and the dmvnorm function, but haven't been
>> > able to get it to work for my data example. The calculated densities are
>> > always zero (at this step of Hadley's code:
>> > dgrid$dens <- dmvnorm(as.matrix(dgrid), ex_mu, ex_sigma) )
>> >
>> > I appreciate any assistance.
>> >
>> > Thanks,
>> > Nate
>> >
>> > library(ggplot2)
>> > library(plyr)  # provides ddply() and .()
>> >
>> > x <- c(seq(0.15, 0.4, length.out=30), seq(0.2, 0.6, length.out=30),
>> >        seq(0.4, 0.6, length.out=30))
>> > y <- c(0.55, x[1:29] + 0.2*rnorm(29, 0.4, 0.3),
>> >        x[31:60]*rnorm(30, 0.3, 0.1), x[61:90]*rnorm(30, 0.4, 0.25))
>> > data <- data.frame(level=c(rep(1, 30), rep(2, 30), rep(3, 30)), x=x, y=y)
>> >
>> > # convex hull of each group's points
>> > find_hull <- function(data) data[chull(data$x, data$y), ]
>> > hulls <- ddply(data, .(level), find_hull)
>> >
>> > fig1 <- ggplot(data=data, aes(x, y, colour=(factor(level)),
>> >                fill=level)) + geom_point()
>> > fig1 <- fig1 + geom_polygon(data=hulls, alpha=.2)
>> > fig1