[R] ggplot: restricting legend with multiple geoms and nested groups

Szumiloski, John John.Szumiloski at bms.com
Tue Feb 7 14:52:22 CET 2017

Dear useRs:

I am having difficulty understanding how to make a legend in ggplot when I only want certain geoms to be indicated, in the presence of nested groups.  An example:


  library(tidyverse)  # provides tibble, dplyr, tidyr, ggplot2 and %>%

  dat <- tibble(X=rep(seq(4),3),
                # fake data
                Y=c(-1.11, -0.46, 0.02, 0.81,
                    -0.51,  0.43, 0.73, 1.39,
                    -0.12,  0.62, 1.19, 1.99),
                G1=rep(seq(3), each=4) %>% factor)

  dat <- dat %>% mutate(lin=predict(lm(Y~X*G1, dat)),
                        quad=predict(lm(Y~poly(X,2)*G1, dat)))

Now dat contains one grouping variable (G1), the X and Y data, and two columns of fitted values (lin and quad).  I want to consolidate the fitted value columns for plotting:

  # stack model fits: make wide -> long
  dat <- dat %>% gather(lin:quad, key="Model", value="fit", factor_key=TRUE)

Thus the Model variable acts as a second grouping variable, nested with G1.
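
As a sanity check on the stacked layout, here is a minimal sketch that rebuilds the same fake data and counts rows per group.  The names dat_long and counts are only for illustration and are not used elsewhere.  With 2 models, 3 levels of G1 and 4 X values, the long data should have 24 rows, four for each Model-by-G1 combination:

```r
library(dplyr)
library(tidyr)
library(tibble)

# same fake data as in the post
dat <- tibble(X=rep(seq(4),3),
              Y=c(-1.11, -0.46, 0.02, 0.81,
                  -0.51,  0.43, 0.73, 1.39,
                  -0.12,  0.62, 1.19, 1.99),
              G1=rep(seq(3), each=4) %>% factor)

# fitted values from the linear and quadratic models
dat <- dat %>% mutate(lin=predict(lm(Y~X*G1, dat)),
                      quad=predict(lm(Y~poly(X,2)*G1, dat)))

# stack the two fit columns (illustrative name: dat_long)
dat_long <- dat %>% gather(lin:quad, key="Model", value="fit", factor_key=TRUE)

# one row per Model-by-G1 combination, with the number of observations in each
counts <- dat_long %>% count(Model, G1)
print(counts)
```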

I want to plot the two fits over the raw data, with the raw data grouped and annotated by G1, and the fits grouped by Model*G1 but annotated only by Model.  Thus each level of G1 gets its own pair of fitted curves, but the Model annotation (color) is the same across all levels of G1.  Here is the code that I thought would do this.

  # init plot
  pl <- ggplot(data=dat, mapping=aes(x=X, y=Y, group=interaction(Model,G1))) +  theme_bw()

  # add raw data in background
  # (filter probably not necessary but prevents redundant overplotting)
  pl <- pl +
    geom_path(data=dat %>% filter(Model=='lin'),
              mapping=aes(x=X, y=Y, group=G1, color=G1), linetype=2, show.legend=FALSE) +
    geom_point(data=dat %>% filter(Model=='lin'),
               mapping=aes(x=X, y=Y, group=G1, color=G1, shape=G1), show.legend=FALSE)

  # add fits
  pl <- pl + geom_path(aes(x=X, y=fit, group=interaction(Model,G1), color=Model))

The plot itself looks as I want it.  However, the legend is titled G1 and lists the levels of G1 alongside the desired Model levels.  I thought setting show.legend=FALSE in the raw data geoms would prevent this.  What I want in the legend is only the two levels of Model, with the legend titled Model.
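
One thing that might be worth trying, offered as an unverified sketch rather than the recommended approach: because color is mapped in all three layers, I assume the levels of G1 and Model end up in a single shared color scale.  If so, the legend could perhaps be limited through that scale's name and breaks arguments.  The object name pl2 is just for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

# same fake data and fits as above
dat <- tibble(X=rep(seq(4),3),
              Y=c(-1.11, -0.46, 0.02, 0.81,
                  -0.51,  0.43, 0.73, 1.39,
                  -0.12,  0.62, 1.19, 1.99),
              G1=rep(seq(3), each=4) %>% factor)
dat <- dat %>% mutate(lin=predict(lm(Y~X*G1, dat)),
                      quad=predict(lm(Y~poly(X,2)*G1, dat)))
dat <- dat %>% gather(lin:quad, key="Model", value="fit", factor_key=TRUE)

# same three layers as in the post
pl <- ggplot(data=dat, mapping=aes(x=X, y=Y, group=interaction(Model,G1))) + theme_bw() +
  geom_path(data=dat %>% filter(Model=='lin'),
            mapping=aes(group=G1, color=G1), linetype=2, show.legend=FALSE) +
  geom_point(data=dat %>% filter(Model=='lin'),
             mapping=aes(group=G1, color=G1, shape=G1), show.legend=FALSE) +
  geom_path(aes(y=fit, color=Model))

# assumption: G1 and Model share one color scale, so keep only the
# Model levels as legend breaks and set the legend title explicitly
pl2 <- pl + scale_color_discrete(name="Model", breaks=levels(dat$Model))
print(pl2)
```

This keeps the colors assigned to the G1 levels in the plot itself and only changes which keys the legend displays.  I do not know whether there is a cleaner way to give the raw data and the fits separate color scales.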

Any assistance greatly appreciated.
John Szumiloski, Ph.D.
Principal Scientist, Statistician
Pharmaceutical Development / Analytical and Bioanalytical Operations <http://teams.bms.com/sites/ARD/>

Bristol-Myers Squibb
P.O. Box 191
1 Squibb Drive
New Brunswick, NJ

(732) 227-7167
