[R] ggplot2 histograms

Small Sandy (NHS Greater Glasgow & Clyde) sandy.small at nhs.net
Wed Dec 1 17:07:18 CET 2010


Sorry, this should have been sent to the whole list:

Hadley

I think I've sorted it out in my head but for the record, and just to be sure...
I guess what I was expecting was that the width parameter would be independent of binwidth, so that a width of 0.5 would always indicate an overlap of half the bar. In fact the width is interpreted relative to the binwidth, so if width is greater than binwidth the bars will overlap adjacent bins rather than the bin they actually correspond to.
So in my example you can completely separate the data by putting
ggplot(data=dafr, aes(x = d1, fill=d2)) + geom_histogram(binwidth = 2, position = position_dodge(width=7))
Obviously this isn't helpful.
I think the rules are:
1. the width of each bar equals binwidth divided by number of fill factors (in my case two)
2. total width of the visible bars would be centred on the centre of the bin
3. overlap of the visible bars is governed by the width parameter of position_dodge, with 0 being complete overlap and binwidth being complete (but touching) separation. More than binwidth would then mean space between the bars, and presumably overlap with the bars of adjacent bins; I don't think this would ever be useful.
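
For the record, the three rules above can be written out as plain arithmetic. This is only a sketch of the rules as I have stated them, not code taken from ggplot2; the function name dodge_sketch and its arguments are made up for illustration.

```r
# Hypothetical helper (not part of ggplot2): where the dodged bars of a
# single bin would sit if the three rules above hold.
dodge_sketch <- function(bin_centre, binwidth, dodge_width, n_groups) {
  # Rule 1: each bar is binwidth / number of fill levels wide.
  bar_width <- binwidth / n_groups
  group <- seq_len(n_groups)
  # Rules 2 and 3: bar centres are spread symmetrically about the bin
  # centre, and the spread is scaled by the dodge width.
  centre <- bin_centre + dodge_width * ((group - 0.5) / n_groups - 0.5)
  data.frame(group = group,
             xmin = centre - bar_width / 2,
             xmax = centre + bar_width / 2)
}

# Bin 0-2 (centre 1), binwidth 2, two fill levels:
dodge_sketch(1, 2, dodge_width = 0, n_groups = 2)  # both bars on 0.5-1.5: complete overlap
dodge_sketch(1, 2, dodge_width = 2, n_groups = 2)  # bars on 0-1 and 1-2: touching
dodge_sketch(1, 2, dodge_width = 7, n_groups = 2)  # bars pushed outside the 0-2 bin
```

With dodge_width equal to binwidth the two bars exactly fill the bin; anything larger moves them into the neighbouring bins, which is what the width=7 example shows.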
Hope this makes sense.
Sandy

Sandy Small
Clinical Physicist
NHS Forth Valley
(Tel: 01324567002)
and
NHS Greater Glasgow and Clyde
(Tel: 01412114592)
________________________________________
From: h.wickham at gmail.com [h.wickham at gmail.com] On Behalf Of Hadley Wickham [hadley at rice.edu]
Sent: 01 December 2010 14:27
To: Small Sandy (NHS Greater Glasgow & Clyde)
Cc: ONKELINX, Thierry; r-help at r-project.org
Subject: Re: [R] ggplot2 histograms

> However if you do:
> ggplot(data=dafr, aes(x = d1, fill=d2)) + geom_histogram(binwidth = 1, position = position_dodge(width=0.99))
>
> The position of the first bin, which goes from 0-2, appears to start at about 0.2 (I accept that there is some "white space" to the left of this), while the position of the last bin (16-18) appears to start at about 15.8, so the whole histogram seems to be wrongly compressed into the scale. In my real data, which has potentially 250 bins, the problem becomes much more pronounced. Has anyone else noticed this? Is there a workaround?

What do you expect this to do?  The bars are one unit wide, but you've
told position_dodge to treat them like they're only 0.99 units wide.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
