[R] Histogram omitting/collapsing groups

Joshua Wiley jwiley.psych at gmail.com
Sun Jan 1 17:34:26 CET 2012


Hi Aren,

I was so busy thinking about how to produce the plot you asked for
that I missed that you are working with hours of the day.  That being
the case, you might consider a circular graph.  The attached plots
show the same data displayed two different ways.

Cheers,

Josh

set.seed(10)
x <- sample(0:23, 10000, TRUE, prob = sin(0:23)+1)

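library(ggplot2) # graphing package

## regular barplot; listing all 24 hours as factor levels keeps empty
## hours, and stat = "identity" plots the precomputed counts as-is
p <- ggplot(as.data.frame(table(x = factor(x, levels = 0:23))),
            aes(x = x, y = Freq)) +
  geom_bar(stat = "identity")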

## using circular coordinates
p2 <- p + coord_polar()

## print them
print(p)
print(p2)


## just if you're interested, the code to
## put the two plots side by side
library(grid)

dev.new(height = 6, width = 12)
grid.newpage()
pushViewport(vpList(
  viewport(x = 0,  width = .5, just = "left", name = "barplot"),
  viewport(x = .5, width = .5, just = "left", name = "windrose")))
seekViewport("barplot")
grid.draw(ggplotGrob(p))
seekViewport("windrose")
grid.draw(ggplotGrob(p2))


On Sun, Jan 1, 2012 at 7:59 AM, Aren Cambre <aren at arencambre.com> wrote:
> On Sun, Jan 1, 2012 at 5:29 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>> Exactly. If what you want is a barplot, make a barplot; histograms are for continuous data.   Just remember that you may need to set the levels explicitly in case of empty groups: barplot(table(factor(x,levels=0:23))). (This is irrelevant with 100K data samples, but not with 100 of them).
>>
>> That being said, the fact that hist() tends to create breakpoints which coincide with data points due to discretization is arguably a bit of a design error, but it is age-old and hard to change now. One way out is to use truehist() from MASS, another is to explicitly set the breaks to intermediate values, as in hist(x, breaks=seq(-.5, 23.5, 1))
>
> Thanks, everybody. I'll definitely switch to barplot.
>
> As for continuous, it's all relative. Even the most continuous dataset
> at a scale that looks pretty to humans may have gaps between the
> values when you "zoom in" a lot.
>
> Aren
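
For anyone following along, Peter's two suggestions quoted above can be
sketched roughly as follows.  This is only an illustration under stated
assumptions: x_small is a made-up sample of 100 values (not Aren's data),
drawn the same way as x in my example, so that some hours are likely to
be empty.

```r
set.seed(10)
## hypothetical small sample (an assumption for illustration only)
x_small <- sample(0:23, 100, TRUE, prob = sin(0:23) + 1)

## table() on the raw values only reports hours that actually occur
counts_raw <- table(x_small)

## declaring all 24 levels keeps empty hours as zero counts
counts_all <- table(factor(x_small, levels = 0:23))
barplot(counts_all, xlab = "Hour of day", ylab = "Frequency")

## breaks at half-integers put exactly one hour in each histogram bin
h <- hist(x_small, breaks = seq(-0.5, 23.5, by = 1), plot = FALSE)
```

With the half-integer breaks, h$counts should match counts_all hour by
hour, which is a quick way to check that no groups were dropped or
collapsed.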



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plots.pdf
Type: application/pdf
Size: 14593 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120101/acc1be4d/attachment.pdf>