[R] superimposing histograms con't

Martin Maechler maechler at stat.math.ethz.ch
Thu Jun 29 17:37:47 CEST 2006


>>>>> "Bill" == Bill Shipley <bill.shipley at usherbrooke.ca>
>>>>>     on Wed, 28 Jun 2006 15:53:35 -0400 writes:

    Bill> Earlier, I posted the following question:

    Bill> I want to superimpose histograms from three
    Bill> populations onto the same graph, changing the shading
    Bill> of the bars for each population. After consulting the
    Bill> help files and the archives I cannot find out how to
    Bill> do this (seemly) simple graph. To be clear, I want

    Bill> - a single x axis (from -3 to 18)
    Bill> - three groups of bars forming the histograms of each population
    Bill>   (they will not overlap much, but this is a detail)
    Bill> - the bars from each histogram having different
    Bill>   shadings or other  visually distinguishing features.
 
    Bill> Gabor Grothendieck [ggrothendieck at gmail.com] pointed
    Bill> to some code to to this but I have found another way
    Bill> that works even easier.
 

    Bill> hist(x[sel1],xlim=c(a,b),ylim=c(A,B))  - this plots the histogram for the
    Bill> first group (indexed by sel1) but with an x axis and a y axis that spans the
    Bill> entire range.
 
    Bill> par(new=T)  - to keep on the same graph
 
    Bill> hist(x[sel2],main=Null,xlab=NULL,ylab=NULL,axes=F) -superimposes the second
    Bill> histogram
 
    Bill> par(new=T)  - to keep on the same graph
 
    Bill> hist(x[sel3],main=Null,xlab=NULL,ylab=NULL,axes=F) -superimposes the third
    Bill> histogram
 
Hmm, the above does not work (because of 'Null' instead of  'NULL')
but even if you fix that, I'm pretty sure you get pretty wrong
plots.

par(new = TRUE) is quite often a bad choice.
In the present case, plot 2 and 3 use different
coordinate systems than plot 1.

The "correct" solution -- apart from using  lattice::histogram()
with the correct 'superpose' panel is really to make use of the
fact that

  1) hist() returns a "histogram" (S3) object which is plotted by
     
  2) plot.histogram() which has a nice help page even though
     the funcion is namespace-hidden.

Here's a reproducible solution --- with quite a bit of extra
code in order to show two ``colorizations'' for overplotting,
in particular one with ``transparent colors'' :


##MM: construct reproducible example data
set.seed(1)
x <- c(rnorm(50), (rnorm(60) +3.5), (rnorm(40) +3.5 + 3))
str(grouping <- factor(c(rep(1,50), rep(2,60), rep(3,40))))
(n.gr <- length(table(grouping))) # 3

xr <- range(x)
## Compute all three histograms objects but without plotting:
histL <- tapply(x, grouping, hist, plot = FALSE)
maxC <- max(print( sapply(lapply(histL, "[[", "counts"), max)) ) # 14 15 15

pdf("3histo.pdf", version = "1.4") # >= 1.4 is needed
## or try the default x11()

if((TC <- transparent.cols <- .Device %in% c("pdf", "png"))) {
    cols <- hcl(h = seq(30, by=360 / n.gr, length = n.gr), l = 65,
                alpha = 0.5) ## << transparency
} else {
    h.den <- c(10, 15, 20)
    h.ang <- c(45, 15, -30)
}

## Plots the histogram for the
## first group (indexed by sel1) but with an x axis and a y axis that spans the
## entire range.
if(TC) {
    plot(histL[[1]], xlim = xr, ylim= c(0, maxC), col = cols[1],
         xlab = "x", main = "3 Histograms of 3 Selections from 'x'")
} else
    plot(histL[[1]], xlim = xr, ylim= c(0, maxC), density = h.den[1], angle = h.ang[1],
         xlab = "x", main = "Histogram of 3 selections from 'x'")
## Using density instead of color make these *add*

if(!transparent.cols) {
    for(j in 2:n.gr)
        plot(histL[[j]], add = TRUE, density = h.den[j], angle = h.ang[j])
} else { ## use semi-translucent colors (available only for PDF >= 1.4 and PNG devices):
    for(j in 2:n.gr)
        plot(histL[[j]], add = TRUE, col = cols[j])
}

if(.Device == "pdf") { ## you have set it above
  dev.off()
  system("gv 3histo.pdf &")## or use 'acroread' ;  'xpdf' does not show transparent colors...
}

1 ##--- Martin Maechler, ETH Zurich



More information about the R-help mailing list