[R] Graph many points without hiding some

Greg Snow Greg.Snow at imail.org
Thu Mar 31 18:07:07 CEST 2011


Just a note: base graphics does support transparency, as long as the device you are plotting to supports it.
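
A minimal sketch of what that can look like, assuming a device with alpha support (pdf() is used here). The seed, the temporary file name, the alpha value of 0.2 and the filled symbol pch = 16 are illustrative choices, not taken from the posts below; adjustcolor() from grDevices builds the semi-transparent colours:

```r
# Sketch only: semi-transparent points in base graphics.
set.seed(1)                        # illustrative seed, for reproducibility
n <- 10000

# Add an alpha channel to three named colours (alpha.f = 0.2 is an assumption).
cols <- adjustcolor(c("green3", "black", "red"), alpha.f = 0.2)

out <- tempfile(fileext = ".pdf")  # hypothetical output file
pdf(out)                           # the PDF device supports alpha
plot(rnorm(n, mean = 19), rnorm(n), col = cols[1], pch = 16,
     xlim = c(16, 24), ylim = c(-4, 4), xlab = "x", ylab = "y")
points(rnorm(n, mean = 20), rnorm(n), col = cols[2], pch = 16)
points(rnorm(n, mean = 21), rnorm(n), col = cols[3], pch = 16)
dev.off()
```

With overlapping semi-transparent points the plotting order matters much less, because earlier groups still show through later ones. On a device without alpha support the same colours may be drawn opaque or not at all. rgb() with its alpha argument is another way to build such colours.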

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Dennis Murphy
> Sent: Thursday, March 31, 2011 1:36 AM
> To: Samuel Dennis
> Cc: R-help at r-project.org
> Subject: Re: [R] Graph many points without hiding some
> 
> Hi:
> 
> I can think of a few options: (1) reducing the size of the points; (2) alpha
> transparency; (3) a combination of (1) and (2).
> 
> From your original plot in base graphics, I reduced cex to 0.2 and it didn't
> look too bad:
> 
> plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24), cex = 0.2)
> points(rnorm(x,mean=20),rnorm(x),col=1, cex = 0.2)
> points(rnorm(x,mean=21),rnorm(x),col=2, cex = 0.2)
> 
> AFAIK, base graphics doesn't have alpha transparency available, but the
> ggplot2 package does. One approach is to adjust the alpha transparency on
> default size points; another is to combine reduced point size with alpha
> transparency. Here is your example rehashed for ggplot2.
> 
> require(ggplot2)
> require(reshape2)   # provides melt()
> d <- data.frame(x1 = rnorm(10000, mean = 19), x2 = rnorm(10000, mean = 20),
>                 x3 = rnorm(10000, mean = 21), x = rnorm(10000))
> # Basically stacking x1 - x3, creating two new vars named variable and value
> dm <- melt(d, id = 'x')
> # Alpha transparency is set to a low level with default point size,
> # but the colors in the legend are muted by the level of transparency
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
>    geom_point(alpha = 0.05) +
>    scale_colour_manual(values = c('x1' = 'black',
>                                   'x2' = 'red', 'x3' = 'green'))
> 
> # A tradeoff is to reduce the point size and increase alpha a bit, but these
> # changes will also be reflected in the legend.
> 
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
>    geom_point(alpha = 0.15, size = 1) +
>    scale_colour_manual(values = c('x1' = 'black',
>                                   'x2' = 'red', 'x3' = 'green'))
> 
> You may well find the legend to be useless for this example, so to get rid
> of it,
> 
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
>    geom_point(alpha = 0.15, size = 1) +
>    scale_colour_manual(values = c('x1' = 'black',
>                                   'x2' = 'red', 'x3' = 'green')) +
>    theme(legend.position = 'none')
> 
> The nice thing about the ggplot2 graph is that you can adjust the point size
> and alpha transparency to your tastes. The default point size is 2 and the
> default alpha = 1 (no transparency).
> 
> HTH,
> Dennis
> 
> On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis <sjdennis3 at gmail.com> wrote:
> 
> > I have a very large dataset with three variables that I need to graph using
> > a scatterplot. However I find that the first variable gets masked by the
> > other two, so the graph looks entirely different depending on the order of
> > variables. Does anyone have any suggestions how to manage this?
> >
> > This code is an illustration of what I am dealing with:
> >
> > x <- 10000
> > plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> > points(rnorm(x,mean=19),rnorm(x),col=3)
> >
> > gives an entirely different looking graph to:
> >
> > x <- 10000
> > plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
> > points(rnorm(x,mean=20),rnorm(x),col=1)
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> >
> > despite being identical in all respects except for the order in which the
> > variables are plotted.
> >
> > I have tried using pch=".", however the colours are very difficult to
> > discern. I have experimented with a number of other symbols with no real
> > solution.
> >
> > The only way that appears to work is to iterate the plot with a for loop,
> > and progressively add a few numbers from each variable, as below. However
> > although I can do this simply with random numbers as I have done here, this
> > is an extremely cumbersome method to use with real datasets.
> >
> > plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
> > x <- 100
> > for (i in 1:100) {
> > points(rnorm(x,mean=19),rnorm(x),col=3)
> > points(rnorm(x,mean=20),rnorm(x),col=1)
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> > }
> >
> > Is there some function in R that could solve this through automatically
> > iterating my data as above, using transparent symbols, or something else?
> > Is there some other way of solving this issue that I haven't thought of?
> >
> > Thank you,
> >
> > Samuel Dennis
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >


