[R] problem with hist()

hadley wickham h.wickham at gmail.com
Fri Jun 15 09:38:25 CEST 2007


On 6/15/07, Mario Dejung <forum at dejung.net> wrote:
> > On 6/14/07, Mario Dejung <forum at dejung.net> wrote:
> >> Hey everybody,
> >> I try to make a graph with two different plots.
> >>
> >>
> >> First I make a boxplot of my data. It is a collection of correlation
> >> values of different pictures. For example:
> >>
> >> 0.23445 pica
> >> 0.34456 pica
> >> 0.45663 pica
> >> 0.98822 picb
> >> 0.12223 picc
> >> 0.34443 picc
> >> etc.
> >>
> >> Ok, I make this boxplot and I get for every picture the boxes. After
> >> this
> >> I want to know, how many correlations per picture exist.
> >> So I make a new vector y <- as.numeric(data$picture)
> >>
> >> So I get for my example something like this:
> >>
> >> y
> >> [1] 1 1 1 1 1 1 1 1 1 1
> >> [11] 1 1 1 1 1 1 1 1 2 2
> >> ...
> >> [16881] 59 59 59 60 60 60 60 60 60 60
> >>
> >> After this I make something like this
> >>
> >> boxplot(cor ~ pic)
> >> par(new = TRUE)
> >> hist(y, nclass = 60)
> >>
> >> But there is my problem. I have 60 pictures, so I get 60 different
> >> boxplots, and I want the hist behind the boxes. But it makes only 59
> >> histbars.
> >>
> >> What can I do? I tried also
> >> hist(y, 1:60) # same effect
> >> and
> >> hist(y, 1:61)
> >> this gives me 60 places, but only 59 bars. The last bar is 0.
> >>
> >> I hope anyone can help me.
> >
> > What does the y axis represent?  It will be counts for the histogram,
> > and correlations for the boxplots.  These aren't comparable, so you're
> > probably better off making two separate graphics.
> >
> > Hadley
> >
> The boxplots show only the median, min, max, etc. of the different
> pictures, but I want to know how many entries are in this plot. Now I
> have done this with the hist function, and when I use different colors, you
> can see that for the first picture there are about 130 entries, but for the
> 8th picture there are only 40 entries...
> Doesn't this make sense?

I think your plot would be clearer if you used two graphics - one
showing the spread, and one showing the number of points (you might
also want to look at notched boxplots).  In the graphic you attached
the bars of the barchart (not histogram! - that's for continuous data)
distract the eye from the boxplots.  You might also want to try
ordering the x axis by mean or number of observations as this will
make it easier to see trends in the data.
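
As a rough sketch of what I mean (the data below are made up, and
cor_val and pic are only stand-in names for your two columns), something
like this draws the two graphics one above the other, with the pictures
ordered by number of observations:

```r
# Hypothetical data: six pictures with unequal numbers of correlations
pic <- factor(rep(paste0("pic", 1:6), times = c(40, 130, 25, 60, 15, 10)))
set.seed(1)
cor_val <- runif(length(pic))

# Reorder the factor levels by number of observations (largest first)
n <- table(pic)
pic <- factor(pic, levels = names(sort(n, decreasing = TRUE)))
n <- table(pic)

# Two separate panels: spread on top, counts below
op <- par(mfrow = c(2, 1), mar = c(4, 4, 1, 1))
boxplot(cor_val ~ pic, notch = TRUE, ylab = "correlation")
barplot(n, ylab = "number of correlations")
par(op)
```

With very few observations per picture the notches can extend past the
hinges (R warns about this), so notch = TRUE may be worth dropping for
sparse groups.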

The confusion with the barchart arises because there are really two
quite different types of barcharts.  One type is basically the same as
a dotchart, but you draw bars instead of dots - this is the default in
R.  The other type is the categorical analog of the histogram, and
this is the default in ggplot2
(http://had.co.nz/ggplot2/geom_bar.html), although the next version will
automatically work out which type you want.
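
To make the distinction concrete, here is a small base R sketch using
the six example values from your first mail (the variable names are my
own). The last lines also suggest why hist(y, 1:60) drew only 59 bars:
n break points define n - 1 bins, and the first bin includes both of its
end points, so pictures 1 and 2 are lumped together.

```r
# The six example rows from the original question
cor_val <- c(0.23445, 0.34456, 0.45663, 0.98822, 0.12223, 0.34443)
pic <- factor(c("pica", "pica", "pica", "picb", "picc", "picc"))

# Type 1: a dotchart drawn with bars. The heights are values you have
# already computed (here, the mean correlation per picture).
means <- sapply(split(cor_val, pic), mean)
barplot(means)
dotchart(means)  # the same information drawn as dots

# Type 2: the categorical analog of the histogram. The heights are counts.
counts <- table(pic)
barplot(counts)

# hist() on the integer codes: 3 break points give only 2 bins, and the
# first bin [1, 2] contains both picture 1 and picture 2.
y <- as.numeric(pic)
h <- hist(y, breaks = 1:3, plot = FALSE)
h$counts
```

If you do want hist() to give one bar per picture, breaks centred on the
integers, e.g. seq(0.5, 60.5, by = 1), should work, but table() with
barplot() is the more direct route for categorical data.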

Hadley


