[R] Unexpected behavior from hist()

Sarah Goslee sarah.goslee at gmail.com
Thu Jun 13 17:35:56 CEST 2013


On Thu, Jun 13, 2013 at 11:13 AM, Mohamed Badawy <mbadawy at pm-engr.com> wrote:
> Hi... I'm still a beginner in R. While doing some curve-fitting with a raw data set of length 22,000, here is what I had:
>> hist(y,col="red")
> gives me the frequency histogram, 13 total rectangles, highest is near 5000.

You don't provide a reproducible example, so here's some fake data:

somedata <- runif(1000)

> Now
>> hist(y,prob=TRUE,col="red",ylim=c(0,1.5))
> gives me the density (probability?) histogram, same number of rectangles, but the highest rectangle is obviously higher than 1, how can this be?!!!

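Because you misread the help. Using freq=FALSE (equivalent to
prob=TRUE, which is a legacy option), you are getting densities, not
probabilities. A bar can be taller than 1 as long as its area (height
times bin width) is no more than 1: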

freq: logical; if ‘TRUE’, the histogram graphic is a representation
          of frequencies, the ‘counts’ component of the result; if
          ‘FALSE’, probability densities, component ‘density’, are
          plotted (so that the histogram has a total area of one).
          Defaults to ‘TRUE’ _if and only if_ ‘breaks’ are equidistant
          (and ‘probability’ is not specified).
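
To see the "total area of one" part in action, here is a minimal sketch.
The data, the seed and the variable names are made up for illustration
and are not from your post; the narrow range is chosen on purpose so
that the bins are narrower than 1 and the densities have to exceed 1.

```r
set.seed(1)                            # assumed seed, only for repeatability
x <- runif(1000, min = 0, max = 0.5)   # hypothetical data on a narrow range

h <- hist(x, plot = FALSE)             # compute the histogram without drawing it

widths      <- diff(h$breaks)          # width of each bin
total_area  <- sum(h$density * widths) # sum of bar areas (height * width)
max_density <- max(h$density)          # tallest bar on the density scale

total_area    # the bar areas add up to 1
max_density   # larger than 1, because every bin is narrower than 1
```

So a density above 1 is not an error; it only says that the bin is
narrow relative to the amount of data that falls into it.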

It sounds like what you actually want is:

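somehist <- hist(somedata, plot=FALSE)
# rescale counts to proportions (bar heights then sum to 1), then plot
somehist$counts <- somehist$counts/sum(somehist$counts)
plot(somehist, ylab="Proportion")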
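
If you want to convince yourself that the rescaled heights really are
proportions, a quick check along these lines should do. The names
fakedata, fakehist, relfreq and from_density are just placeholders I
made up, and the seed is arbitrary.

```r
set.seed(42)                        # arbitrary seed, only for repeatability
fakedata <- runif(1000)             # hypothetical stand-in for the real data
fakehist <- hist(fakedata, plot = FALSE)

# proportion of observations falling in each bin
relfreq <- fakehist$counts / sum(fakehist$counts)

# the same numbers recovered from the density scale: height * bin width
from_density <- fakehist$density * diff(fakehist$breaks)

sum(relfreq)                        # proportions add up to 1
all.equal(relfreq, from_density)    # both routes give the same heights
```

Each proportion is at most 1 by construction, which is what I think you
expected from the prob=TRUE plot.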

> P.S. I had to post this thread via email as it got rejected as I posted it from Nabble, reason was "Message rejected by filter rule match"

Nabble is not the R-help mailing list. Posting via email is the
correct thing to do.


Sarah Goslee
