[R] barplot2, gap.barplot

hadley wickham h.wickham at gmail.com
Fri Mar 2 21:57:59 CET 2007


On 3/2/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> On Fri, 2007-03-02 at 10:07 -0600, hadley wickham wrote:
> > On 3/2/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> > > On Fri, 2007-03-02 at 08:53 -0600, hadley wickham wrote:
> > > > > 3. Depending on the nature of your data, if the extreme value is
> > > > > representative of an important marked difference relative to the other
> > > > > values, then I don't particularly find the 'look' of the plot to be
> > > > > overly problematic. It does appropriately emphasize the large
> > > > > difference.
> > > > >
> > > > > On the other hand, you might want to consider using a log scale on the y
> > > > > axis as an alternative to an axis gap. This would be a reasonable
> > > > > approach to plotting values that have a notable difference in range.  If
> > > > > you do this, note that you would need to ensure that all y values are >0
> > > > > (i.e., y axis range minimum, lower bounds of CIs, etc.) since:
> > > > >
> > > > > > log10(0)
> > > > > [1] -Inf
> > > > >
> > > > >
> > > >
> > > > Of course, you can't do this with a bar plot, because bars should be
> > > > anchored at 0.
> > >
> > > Both barplot() and barplot2() support log scaling for both x and y axes.
> > >
> > > In both functions, the default axis minimum for the 'height' axis (y by
> > > default, x if 'horizontal = TRUE') will be 0.9 * min(height) to avert
> > > log10(0) related issues. Errors will be issued otherwise if any values
> > > of 'height' are <= 0 or 'ylim'/'xlim' args are similarly set.
> >
> > I think that's a pretty bad idea - in a bar plot you are comparing the
> > ratio of heights of the bars, not the absolute heights.  It's the same
> > reason it's a bad idea to have a bar graph with a non-0 y-axis - it's
> > misleading.
>
> Hadley,
>
> I might note that even lattice will do this, arguably easier than
> barplot[2]():
>
> library(lattice)
> x <- 10 ^ (0:10)
> barchart(x ~ 0:10, horizontal = FALSE,
>          scales = list(y = list(log = 10)))
>
>
> Is it the right thing to do?  I'll leave that for others to debate. I
> have stronger feelings on the 'gapped axis' issue.

I think this is up there with double and gapped axes, although it's
much easier to resolve: just use a dot plot instead (which is
generally a pretty good rule whenever you want to use a bar plot).
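
As a rough sketch of that suggestion (an illustration only, reusing the
example data and the lattice package from the quoted barchart() call; the
variable name p is just for this example), the same values can be drawn
as a dot plot on a log scale, where position rather than bar length
carries the comparison:

```r
library(lattice)

# Same example data as in the quoted barchart() call
x <- 10 ^ (0:10)

# Horizontal dot plot: categories on the y axis, values on a log10 x axis.
# There are no bars here, so nothing needs to be anchored at 0.
p <- dotplot(factor(0:10) ~ x,
             scales = list(x = list(log = 10)))
print(p)
```

Because the points encode position only, the log scale does not distort
a length comparison the way it would for bars.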

> There have been requests for log scales on barplots on the R lists going
> back several years, which is one of the reasons that I wrote barplot2()
> some years ago.  It was also one of my first exercises in gaining a
> lower level understanding of R's graphics models.

People are always asking for things they don't really want! ;)

I (obviously) have pretty strong feelings about graphics - I don't
think you should be able to create meaningless (in some sense)
graphics.

Hadley


