[R] label outliers in geom_boxplot (ggplot2)

hadley wickham h.wickham at gmail.com
Thu Jun 5 19:06:17 CEST 2008

2008/5/27 Mihalicza Péter <mihalicza.peter at eski.hu>:
> Dear List and Hadley,
> I would like to have a boxplot with ggplot2 and have the outlier values
> labelled with their "name" attribute. So I did
>   > library(ggplot2)
>   > dat=data.frame(num=rep(1,20), val=c(runif(18),3,3.5), name=letters[1:20])
>   > p=ggplot(dat, aes(y=val, x=num))+geom_boxplot(outlier.size=4, outlier.colour="green")
>   > p+geom_text(label=dat$name)
> But this, of course, labels all the data points. So I searched high and low
> for a way to label only the outliers, but I couldn't find any solution.
> Probably my keywords were inappropriate, but I also looked at the ggplot
> website and the book. So I did this:
>   > boxout=boxplot(dat$val)$out
>   > outname=as.character(dat$name)
>   > outname[(dat$val %in% boxout)==FALSE]="\n"
>   > p+geom_text(label=outname)
> This works, but seems like a hack to me. Is there an obvious solution that I
> am missing?

I don't think so.  This type of problem (where you need to
independently access the statistics generated by ggplot) does come up
fairly often, but I don't have any particularly good solution for it.
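
For illustration, one way to tidy the workaround is to compute the outliers
outside the plot and hand only those rows to geom_text(). This is a minimal
sketch, not a built-in ggplot2 feature. It assumes a ggplot2 version in which
geom_text() accepts its own data argument, and it assumes that base R's
boxplot.stats() flags the same points as geom_boxplot(), which may not hold
exactly because the two can compute quartiles slightly differently. The seed,
the `outliers` data frame, and the use of factor(num) on the x axis are
additions made for the example.

```r
# Sketch: flag outliers with base R, then label only those rows.
set.seed(1)  # assumption: fixed seed so the example is reproducible
dat <- data.frame(num  = rep(1, 20),
                  val  = c(runif(18), 3, 3.5),
                  name = letters[1:20],
                  stringsAsFactors = FALSE)

# boxplot.stats() applies the 1.5 * IQR rule without drawing a plot
out_vals <- boxplot.stats(dat$val)$out
outliers <- dat[dat$val %in% out_vals, ]

# Build the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = factor(num), y = val)) +
    geom_boxplot(outlier.size = 4, outlier.colour = "green") +
    # the text layer sees only the outlier rows, so nothing else is labelled
    geom_text(data = outliers, aes(label = name), hjust = -0.5)
}
```

Printing `p` draws the plot. Compared with blanking out the unwanted labels,
subsetting the data keeps the label layer independent of the row order in
`dat`.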

> Another thing: it seems that if there is only one outlier, ggplot doesn't
> show it, although it adjusts the axis to include it and also plots the label
> when geom_text() is added:

That's a bug.  Thanks for pointing it out; it should be fixed in
the next version.


P.S.  Sorry for taking so long to respond; I've been at my sister's
wedding in New Zealand.

