[R] Superposed histograms

Frank E Harrell Jr fharrell at virginia.edu
Fri Jan 10 12:58:03 CET 2003


On Fri, 10 Jan 2003 10:41:31 +0000 (GMT)
Damon Wischik <djw1005 at cam.ac.uk> wrote:

> 
> I would like to plot cumulative histograms. Specifically,
> I have data like
>     Sex     M   M   F   M   F   F   M   F
>     Height  6   6.3 6.1 5.5 7.2 6.2 5.9 6.0  ....
> and I want to plot a histogram of the distribution of all heights,
> colouring the histogram bars according to sex, for example
>    
>   |    o   
>   |   oo  o  
>   | o oo ** o   o = observations of women
>   | o o*o***o   * = observations of men
>   | *o*******
>   |----------
> 
> (And I want this in a Trellis plot, and with more than two groups of
> observations.) How should I do this? I tried looking for imaginative
> combinations of panel.superpose and panel.histogram. I suppose if I called
> panel.histogram for the cumulative data first, then panel.histogram for
> just the data on men, with a different colour, I could achieve the effect. 
> But I'd need to superpose the accumulated data, and panel.superpose seems
> to only separate the data by group, not accumulate data by group.
> 
> Damon Wischik.

I don't think this will be effective from a graphical perception point of view.  One problem is that the bottom symbols will be perceived differently from the symbols assigned to the upper region, because the upper symbols are not bottom-aligned: their bars start at varying heights, so their lengths are hard to compare across bins.  I suggest the usual multi-panel histograms or back-to-back histograms (see e.g. histbackback in the Hmisc package).  But better still would be superposed ECDFs (e.g., ecdf() in Hmisc or in Martin Maechler's package).  In my view, ECDFs are much better for showing differences between distributions.
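
As a rough illustration of the superposed-ECDF suggestion, here is a minimal sketch. It assumes the eight values shown in the question, held in a data frame with the hypothetical names `heights`, `sex` and `height`. It uses `ecdf()` from base R's stats package rather than the Hmisc function, so it is only one possible way to draw the plot, and it extends to more than two groups without change.

```r
# Data copied from the excerpt in the question; the data frame and
# column names are assumptions made for this sketch.
heights <- data.frame(
  sex    = c("M", "M", "F", "M", "F", "F", "M", "F"),
  height = c(6.0, 6.3, 6.1, 5.5, 7.2, 6.2, 5.9, 6.0)
)

# One empirical CDF per group, using ecdf() from the stats package.
groups <- split(heights$height, heights$sex)
cdfs   <- lapply(groups, ecdf)

# Superpose the step functions on common axes, one colour per group.
cols <- seq_along(cdfs)
plot(cdfs[[1]], xlim = range(heights$height), col = cols[1],
     main = "ECDF of height by sex",
     xlab = "Height", ylab = "Proportion <= x")
for (i in seq_along(cdfs)[-1]) {
  plot(cdfs[[i]], add = TRUE, col = cols[i])
}
legend("bottomright", legend = names(cdfs), col = cols, lty = 1)
```

Because every curve starts at 0 and ends at 1 on the same scale, shifts in location and spread between the groups can be read directly from the horizontal and vertical gaps between the curves.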
-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat



