Barplot() with log axis scaling and other features

Marc Schwartz mschwartz@medanalytics.com
Thu, 19 Sep 2002 17:51:47 -0500


This is a multi-part message in MIME format.
--------------090009040901000308090002
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit


To all,

Recently I created a version of barplot() to include some new features 
as follows:

1. Plot confidence intervals for each bar: The user can pass lower 
(ci.l) and upper (ci.u) bounds in the same vector/matrix format as 
"height".  The CI boundary lines will vary in width per bar, if a 
varying "width" argument is specified. CI line width, type and color can 
be defined.  Defaults to 1, "solid" and black, respectively. Works with 
horizontal or vertical plot. Will work with beside = TRUE. Will not work 
if stacked bar plot results at this time. Argument plot.ci = c("TRUE", 
"FALSE") defines whether these are drawn.

2. Color plot region separately from the background area, if desired. 
Argument "prcol" defaults to NULL.

3. Draw "height" axis grid lines *behind* the bars.  Grid line width, 
type and color can be specified. Defaults to 1, "dotted" and black 
respectively. Argument plot.grid = c("TRUE", "FALSE") defines if the 
grid is plotted.

I have made these modifications with the intent to not "break" any 
existing code, so that identical calls to this version should work the 
same as the current implementation.

The current full function argument list is as follows:

barplot2 <- function(height, ...) UseMethod("barplot2")

barplot2.default <-
function(height, width = 1, space = NULL, names.arg = NULL,
      legend.text = NULL, beside = FALSE, horiz = FALSE,
      density = NULL, angle = 45,
      col = heat.colors(NR), prcol = NULL, border = par("fg"),
      main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
      xlim = NULL, ylim = NULL, xpd = TRUE, log = "",
      axes = TRUE, axisnames = TRUE,
      cex.axis = par("cex.axis"), cex.names = par("cex.axis"),
      inside = TRUE, plot = TRUE,
      plot.ci = FALSE, ci.l = NULL, ci.u = NULL,
      ci.color = "black", ci.lty = "solid", ci.lwd = 1,
      plot.grid = FALSE, grid.inc = 5,
      grid.lty = "dotted", grid.lwd = 1, grid.col = "black",
      ...)


In late August, there was a post from Rengie Chan with a query on using 
log axis scaling with barplot(), which is presently precluded as coded 
as per my reply to his post.  I also noted that Paul Murrell has a note 
about this on his to-do page at: 
http://www.stat.auckland.ac.nz/~paul/R/graphicstodos.html

I have spent some time over the past couple of weeks or so (thinking 
that this was going to be easier than it turned out to be) working on 
this as time has permitted.

I now have barplot2() working with both x and/or y axis log scaling, for 
single bars, multiple bars ("beside = TRUE") and for stacked bars.

Some of the key issues are of course preventing log(<=0) related errors 
in the axis scaling, which I have done and incorporating the other 
features that I have added above to be compatible with log scaling. 
These include adjusting the par("usr") values as well as the bar 
baseline values when stacked for log scaling and so forth. Also, the 
"hatch" type bar shading does not work with a log scale, so an error 
check is included to catch this. 

I have attached the R code for this version, called barplot2(), for the 
time being.  I make this code available to R-Core if so desired, and to 
the community at large.  If anyone finds any bugs, let me know.

The areas that I am still a bit unsure of (and open to criticism on) are 
as follows:

1. When using log scaling, I have arbitrarily set the lower limit of the 
"height" related axis to 0.9 * min(height), to enable some of the 
smallest bar to actually be plotted. This figure is adjusted to include 
0.9 * min(ci.l) when plotting CI's.  This can be over-ridden by setting 
either ylim or xlim explicitly of course.  If anyone feels that this 
should be different, let me know.

2. When plotting grid lines, I have resorted to using pretty() to define 
the axis tick marks, labels and grid lines by default.  This took some 
time, after I noted posts to R-Help on the issues with par("xaxp") and 
par("yaxp") when log scaling is used. I also noted Ross Ihaka's 
sci.axis() function. I wrestled with trying to be consistent in the 
behavior of the grid lines for both linear and log scaling and differing 
approaches.

However, given that both barplot() and more significantly the axis() C 
code in plot.c forces "lty = 0" as presently coded, I wanted to give the 
user the ability to visually "de-emphasize" the gridlines with 
alternative line types, widths and colors. Thus I am using abline() with 
intervals set by pretty(), with the user able to specify how many 
intervals using "grid.inc", with a default of 5 as in pretty(). If 
anyone has any better thoughts on how to approach this, I am open.

I was trying to figure out a way to call the C code in plot.c that 
generates the default axis intervals (CreateAtVector), but would get 
errors indicating that the function is not in the load table.  This 
would seem to be a cleaner approach at first glance, since it would 
result in consistent default axis interval behavior when not plotting 
grid lines.  I am open to pointers or other thoughts on this.

Finally, there are two .NotYetUsed arguments in barplot().

First is "border", which defaults to par("fg").  I presume that this is 
simply meant to be the "border" argument to rect()?  If so, I can make 
that change easily.

Second is the "inside" argument.  If I read the help file correctly, 
this will draw a line between adjacent bars.  However, I am unsure if 
this is intended to be a "grouping" line that runs the full height or 
width of the plotting region such that it divides a group of bars (ie. 
beside = TRUE) or if it is simply a heavier line between each bar.  If 
someone can clarify this, I can work on this as well.

I appreciate any feedback from the group and hope that this work is of 
value to users of barplot().

Best regards,

Marc Schwartz



--------------090009040901000308090002
Content-Type: text/plain;
 name="barplot2.r"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="barplot2.r"

barplot2 <- function(height, ...) UseMethod("barplot2")

barplot2.default <-
function(height, width = 1, space = NULL, names.arg = NULL,
       legend.text = NULL, beside = FALSE, horiz = FALSE,
       density = NULL, angle = 45,
       col = heat.colors(NR), prcol = NULL, border = par("fg"),
       main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
       xlim = NULL, ylim = NULL, xpd = TRUE, log = "",
       axes = TRUE, axisnames = TRUE,
       cex.axis = par("cex.axis"), cex.names = par("cex.axis"),
       inside = TRUE, plot = TRUE,
       plot.ci = FALSE, ci.l = NULL, ci.u = NULL,
       ci.color = "black", ci.lty = "solid", ci.lwd = 1,
       plot.grid = FALSE, grid.inc = 5,
       grid.lty = "dotted", grid.lwd = 1, grid.col = "black",
       ...)
{
    if (!missing(inside)) .NotYetUsed("inside", error = FALSE)
    if (!missing(border)) .NotYetUsed("border", error = FALSE)

    if (missing(space))
      space <- if (is.matrix(height) && beside) c(0, 1) else 0.2

    space <- space * mean(width)

    if (plot && axisnames && missing(names.arg))
      names.arg <- if(is.matrix(height)) colnames(height) else names(height)

    if (is.vector(height))
    {
      height <- cbind(height)
      beside <- TRUE
    }
    else if (is.array(height) && (length(dim(height)) == 1))
    {
      height <- rbind(height)
      beside <- TRUE
    }
    else if (!is.matrix(height))
      stop("`height' must be a vector or a matrix")

    if(is.logical(legend.text))
    {
      if(legend.text && is.matrix(height))
        legend.text <- rownames(height)
      else
        legend.text <- NULL
    }

    # Check for log scales
    logx <- FALSE
    logy <- FALSE

    if (log != "")
    {
      if (any(grep("x", log)))
        logx <- TRUE

      if (any(grep("y", log)))
        logy <- TRUE
    }

    # Cannot "hatch" with rect() when log scales used
    if ((logx || logy) && !is.null(density))
      stop("Cannot use shading lines in bars when log scale is used")

    NR <- nrow(height)
    NC <- ncol(height)

    if (beside)
    {
      if (length(space) == 2)
        space <- rep(c(space[2], rep(space[1], NR - 1)), NC)

      width <- rep(width, length = NR * NC)
    }
    else
      width <- rep(width, length = NC)

    delta <- width / 2
    w.r <- cumsum(space + width)
    w.m <- w.r - delta
    w.l <- w.m - delta

    #if graphic will be stacked bars, do not plot ci
    if (!beside && (NR > 1) && plot.ci)
      plot.ci = FALSE

    # error check ci arguments
    if (plot && plot.ci)
    {
      if ((missing(ci.l)) || (missing(ci.u)))
        stop("confidence interval values are missing")

      if (is.vector(ci.l))
        ci.l <- cbind(ci.l)
      else if (is.array(ci.l) && (length(dim(ci.l)) == 1))
        ci.l <- rbind(ci.l)
      else if (!is.matrix(ci.l))
        stop("`ci.l' must be a vector or a matrix")

      if (is.vector(ci.u))
        ci.u <- cbind(ci.u)
      else if (is.array(ci.u) && (length(dim(ci.u)) == 1))
        ci.u <- rbind(ci.u)
      else if (!is.matrix(ci.u))
        stop("`ci.u' must be a vector or a matrix")

      if (dim(height) != dim(ci.u))
        stop("'height' and 'ci.u' must have the same dimensions.")

      if (dim(height) != dim(ci.l))
        stop("'height' and 'ci.l' must have the same dimensions.")
    }

    if (beside)
      w.m <- matrix(w.m, nc = NC)

    # check height/ci.l if using log scale to prevent log(<=0) error
    # adjust appropriate ranges and bar base values
    if ((logx && horiz) || (logy && !horiz))
    {
      if (min(height) <= 0)
        stop("log scale error: at least one 'height' value <= 0")

      if (plot.ci && (min(ci.l) <= 0))
        stop("log scale error: at least one lower c.i. value <= 0")

      if (logx && !is.null(xlim) && (xlim[1] <= 0))
        stop("log scale error: 'xlim[1]' <= 0")

      if (logy && !is.null(ylim) && (ylim[1] <= 0))
        stop("'log scale error: 'ylim[1]' <= 0")

      # arbitrary adjustment to display some of bar for min(height) or min(ci.l)
      if (plot.ci)
        rectbase <- rangeadj <- (0.9 * min(c(height, ci.l)))
      else
        rectbase <- rangeadj <- (0.9 * min(height))

      # if axis limit is set to < above, adjust bar base value
      # to draw a full bar
      if (logy && !is.null(ylim))
        rectbase <- ylim[1]
      else if (logx && !is.null(xlim))
        rectbase <- xlim[1]

      # if stacked bar, set up base/cumsum levels, adjusting for log scale
      if (!beside)
        height <- rbind(rectbase, apply(height, 2, cumsum))

      # if plot.ci, be sure that appropriate axis limits are set to include range(ci)
      lim <-
        if (plot.ci)
          c(height, ci.l, ci.u)
        else
          height
    }
    else
    {
      # Use original bar base value
      rectbase <- 0

      # if stacked bar, set up base/cumsum levels
      if (!beside)
        height <- rbind(rectbase, apply(height, 2, cumsum))

      # if plot.ci, be sure that appropriate axis limits are set to include range(ci)
      lim <-
        if (plot.ci)
          c(height, ci.l, ci.u)
        else
          height

      # use original range adjustment factor
      rangeadj <- (-0.01 * lim)
    }

    # define xlim and ylim, adjusting for log-scale if needed
    if (horiz)
    {
      if (missing(xlim)) xlim <- range(rangeadj, lim, na.rm=TRUE)
      if (missing(ylim)) ylim <- c(min(w.l), max(w.r))
    }
    else
    {
      if (missing(xlim)) xlim <- c(min(w.l), max(w.r))
      if (missing(ylim)) ylim <- range(rangeadj, lim, na.rm=TRUE)
    }

    if(plot) ##-------- Plotting :
    {
      opar <-
        if (horiz)
          par(xaxs = "i", xpd = xpd)
        else
          par(yaxs = "i", xpd = xpd)

      on.exit(par(opar))

      plot.new()
      plot.window(xlim, ylim, log = log, ...)

      # if prcol specified, set plot region color
      if (!missing(prcol))
      {
        usr <- par("usr")

        # adjust par("usr") values if log scale(s) used
        if (logx)
        {
          usr[1] <- 10 ^ usr[1]
          usr[2] <- 10 ^ usr[2]
        }

        if (logy)
        {
          usr[3] <- 10 ^ usr[3]
          usr[4] <- 10 ^ usr[4]
        }

        rect(usr[1], usr[3], usr[2], usr[4], col = prcol)
      }

      # if plot.grid, draw major y-axis lines if vertical or x axis if horizontal
      # Due to issues with xaxp and yaxp when using log scale, use pretty() to set
      # grid and axis increments for for both linear and log scales when plotting grid lines
      if (plot.grid)
      {
        par(xpd = FALSE)

        if (horiz)
        {
          grid = pretty(xlim, n = grid.inc)
          abline(v = grid, lty = grid.lty, lwd = grid.lwd, col = grid.col)
        }
        else
        {
          grid = pretty(ylim, n = grid.inc)
          abline(h = grid, lty = grid.lty, lwd = grid.lwd, col = grid.col)
        }

         par(xpd = xpd)
      }

      # Beware : angle and density are passed using R scoping rules
      xyrect <- function(x1,y1, x2,y2, horizontal = TRUE, ...)
      {
        if(horizontal)
          rect(x1,y1, x2,y2, angle = angle, density = density, ...)
        else
          rect(y1,x1, y2,x2, angle = angle, density = density, ...)
      }

      if (beside)
        xyrect(rectbase, w.l, c(height), w.r, horizontal=horiz, col = col)
      else
      {
        for (i in 1:NC)
          xyrect(height[1:NR, i], w.l[i], height[-1, i], w.r[i], horizontal=horiz, col = col)
      }

      if (plot.ci)
      {
        # CI plot width = barwidth / 2
        ci.width = width / 4

        if (horiz)
        {
          segments(ci.l, w.m, ci.u, w.m, col = ci.color, lty = ci.lty, lwd = ci.lwd)
          segments(ci.l, w.m - ci.width, ci.l, w.m + ci.width, col = ci.color, lty = ci.lty, lwd = ci.lwd)
          segments(ci.u, w.m - ci.width, ci.u, w.m + ci.width, col = ci.color, lty = ci.lty, lwd = ci.lwd)
        }
        else
        {
          segments(w.m, ci.l, w.m, ci.u, col = ci.color, lty = ci.lty, lwd = ci.lwd)
          segments(w.m - ci.width, ci.l, w.m + ci.width, ci.l, col = ci.color, lty = ci.lty, lwd = ci.lwd)
          segments(w.m - ci.width, ci.u, w.m + ci.width, ci.u, col = ci.color, lty = ci.lty, lwd = ci.lwd)
        }
      }

      if (axisnames && !is.null(names.arg)) # specified or from {col}names
      {
        at.l <-
        if (length(names.arg) != length(w.m))
        {
          if (length(names.arg) == NC) # i.e. beside (!)
            colMeans(w.m)
          else
            stop("incorrect number of names")
        }
        else w.m

        axis(if(horiz) 2 else 1, at = at.l, labels = names.arg, lty = 0, cex.axis = cex.names, ...)
      }

      if(!is.null(legend.text))
      {
        legend.col <- rep(col, length = length(legend.text))

        if((horiz & beside) || (!horiz & !beside))
        {
          legend.text <- rev(legend.text)
          legend.col <- rev(legend.col)
          density <- rev(density)
          angle <- rev(angle)
        }

	    xy <- par("usr")
	    legend(xy[2] - xinch(0.1), xy[4] - yinch(0.1),
             legend = legend.text, angle = angle, density = density,
             fill = legend.col, xjust = 1, yjust = 1)
      }

      title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...)

      # if axis is to be plotted, adjust for grid "at" values
      if (axes)
      {
        if(plot.grid)
          axis(if(horiz) 1 else 2, at = grid, cex.axis = cex.axis, ...)
        else
          axis(if(horiz) 1 else 2, cex.axis = cex.axis, ...)
      }

      invisible(w.m)

    }

    else w.m
}

--------------090009040901000308090002--

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._