[R] FW: Bubble plots

John Fox jfox at mcmaster.ca
Sat Aug 2 20:41:22 CEST 2008


Dear Cody, Frank, and Hadley,

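Perhaps a more general point is that, by passing a vectorized cex argument
to plot() or points(), one can specify the relative radii of the plotted
circles. Since cex scales the radius, using the square root of a quantity
gives circles whose areas are proportional to that quantity.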

Try, for example, plot(1:10, cex=sqrt(1:10)).
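
As a rough sketch of how the same idea could be applied to Cody's data with
base graphics alone: the code below tabulates the proportions and draws one
filled circle per (time, y) cell. The scaling constant max_cex, the grey fill,
and the axis limits are arbitrary choices made for illustration, not part of
anyone's posted code.

```r
# Cody's example data: level of y for 5 patients at three time points
time <- c(rep("time 1", 5), rep("time 2", 5), rep("time 3", 5))
y <- c("a","b","c","d","a","b","c","a","d","a","a","a","b","c","d")

# Proportion of patients in each level of y within each time point
tab <- prop.table(table(y, time), margin = 2)
df <- as.data.frame(tab, responseName = "freq")
df <- df[df$freq > 0, ]  # skip empty cells, if any

# cex scales the radius, so sqrt() makes bubble area proportional to freq.
# max_cex (illustrative assumption) is the cex given to the largest bubble.
max_cex <- 8
plot(as.integer(df$time), as.integer(df$y),
     cex = max_cex * sqrt(df$freq / max(df$freq)),
     pch = 16, col = "grey40",
     xaxt = "n", yaxt = "n", xlab = "time", ylab = "y",
     xlim = c(0.5, 3.5), ylim = c(0.5, 4.5))

# Label the axes with the factor levels rather than the integer codes
axis(1, at = seq_along(levels(df$time)), labels = levels(df$time))
axis(2, at = seq_along(levels(df$y)), labels = levels(df$y))
```

Plotting the integer codes of the factors and relabelling the axes keeps the
bubbles on a regular grid, with time on the x-axis and y on the y-axis.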

Regards,
 John

On Sat, 2 Aug 2008 08:24:33 -0500
 "hadley wickham" <h.wickham at gmail.com> wrote:
> On Sat, Aug 2, 2008 at 8:10 AM, Frank E Harrell Jr
> <f.harrell at vanderbilt.edu> wrote:
> > Cody Hamilton wrote:
> >>
> >> Is there a way to create a 'bubble plot' in R?
> >>
> >> For example, if we define the following data frame containing the level of
> >> y observed for 5 patients at three time points:
> >>
> >> time<-c(rep('time 1',5),rep('time 2',5),rep('time 3',5))
> >> y<-c('a','b','c','d','a','b','c','a','d','a','a','a','b','c','d')
> >> D<-data.frame(cbind(y,time))
> >>
> >> I would like to display the percentage of subjects in each level of y at
> >> each time point as a bubble whose size is proportional to the percentage of
> >> subjects in the given level of y at the given time point.  Thus, in the case
> >> of the data frame above the plot would have the levels of y
> >> ('a','b','c','d') on the y-axis and the levels of time ('time 1','time 2',
> >> 'time 3') on the x-axis with four bubbles above each time point (e.g. the
> >> size of the bubble in the bottom left corner of the plot would be
> >> proportional to the percentage of patients with y='a' at time='time 1').
> >>
> >> I am running R 2.7.1 under Windows.
> >>
> >> Regards,
> >>   -Cody
> >>
> >
> > The xYplot function in the Hmisc package can do that.  It may be more
> > elegant using ggplot2.
> 
> It's certainly possible to do it with ggplot2:
> 
> tab <- prop.table(table(D), margin = 2)
> df <- as.data.frame(tab, responseName = "freq")
> 
> library(ggplot2)
> qplot(y, time, data = df, size = freq)
> qplot(y, time, data = df, size = freq) + scale_area()
> qplot(y, time, data = df, size = freq) + scale_area(to=c(1,5))
> 
> But I wouldn't recommend it - you're trying to visualise an important
> number (frequency) using a perceptual mapping (size) that humans
> aren't very good at.  Why not do a scatterplot of frequency vs time?
> 
> qplot(time, freq, data=df, colour = y)
> 
> There are only a few different values of freq for this example, so a
> little jittering helps:
> 
> qplot(time, freq, data=df, colour = y, geom="jitter")
> 
> Since you have time on the x-axis it's common to use a line plot:
> 
> df$time <- as.numeric(gsub("time ", "", df$time))
> qplot(time, freq, data=df, colour = y, geom="line")
> 
> although again you have an overplotting problem, which you could solve
> with jittering:
> 
> qplot(time, freq, data=df, colour = y, geom="line", position="jitter")
> 
> Hadley
> 
> -- 
> http://had.co.nz/
> 

--------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/


