[R] relative frequency plot

Martin Maechler maechler at stat.math.ethz.ch
Thu Apr 27 21:30:24 CEST 2006


>>>>> "Erik" == Erik Iverson <iverson at biostat.wisc.edu>
>>>>>     on Thu, 27 Apr 2006 13:44:16 -0500 writes:

    Erik> See ?truehist in the MASS package.

Not in this case!
truehist() also computes a density,
and its values on the "y axis" are not probabilities, either!
  hist(*, freq = FALSE)
is fully sufficient here -- the original poster's real problem was
to understand that a density can take values larger than 1: it is
the area under the density, not its height, that adds up to 1.
It is interesting, and somewhat disappointing for us "teachers of
statistics", to see how many people have posted on this exact
topic in the past, some of them more or less assuming that R was
doing something wrong because it showed densities (or density
estimates, as here) with values larger than one...  oh dear
    "Against stupidity the gods themselves contend in vain."
     - Friedrich Schiller, "Die Jungfrau von Orleans"
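
A minimal sketch of the point above, not part of the original
message: the data, the seed and the breaks are made up purely for
illustration. Values confined to an interval of width 0.5 force the
density estimate above 1, while the bar areas still add up to 1.

```r
# Assumption: simulated data on [0, 0.5], chosen only for illustration.
set.seed(1)                          # arbitrary seed, for reproducibility
x <- runif(1000, min = 0, max = 0.5)

# Ten equal-width bins covering [0, 0.5]; no plot, just the object.
h <- hist(x, breaks = seq(0, 0.5, length.out = 11), plot = FALSE)

max(h$density)                       # larger than 1
sum(h$density * diff(h$breaks))      # total area: 1 (up to rounding)
```

The heights in h$density are densities per unit of x, so they only
turn into probabilities after multiplying by the bin widths.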

Martin

    >>   Philipp Pagel wrote:
    >> On Thu, Apr 27, 2006 at 10:48:39AM -0700, nlei at sfu.ca
    >> wrote:
    >> 
    >>> Hi All,
    >>> 
    >>> I want to use "hist" to get a relative frequency
    >>> plot. But the range of the y axis is greater than 1,
    >>> whereas I think it should be less than 1, since it
    >>> stands for the probability.
    >>> 
    >>> I'm confused. Could you please help me with it?
    >> 
    >> 
    >> I was pretty confused by that, too, at first. The
    >> explanation is that freq=FALSE causes hist to plot the
    >> DENSITY rather than the frequency. And density is not
    >> necessarily the same as relative frequency. Excerpt
    >> from ?hist:
    >> 
    >> density: values f^(x[i]), as estimated density values. If
    >> 'all(diff(breaks) == 1)', they are the relative
    >> frequencies 'counts/n' and in general satisfy sum[i;
    >> f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.
    >> 
    >> If you want relative frequencies, try something like this:
    >> 
    >> myhist = hist(x, breaks=52, plot=FALSE)
    >> # rescale the counts so that they sum to 1
    >> myhist$counts = myhist$counts / sum(myhist$counts)
    >> plot(myhist, main=NULL, border=TRUE, xlab="days", xlim=c(0,6), lty=2)
    >> 
    >> Not exactly clean, though -- we are messing with the
    >> myhist object...
    >> 
    >> 
    >> cu Philipp
    >> 
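
One possible way to get relative frequencies without "messing with
the myhist object" is sketched below. It is only an illustration:
the data vector, the breaks and the name rel_freq are assumptions,
not taken from the thread. It also checks the identity quoted from
?hist, namely that density times bin width gives counts/n.

```r
# Assumption: made-up example data on the range 0 to 6 ("days").
x <- c(0.2, 0.4, 0.5, 1.1, 1.3, 2.7, 3.0, 3.2, 4.8, 5.5)

# Compute the histogram without plotting; the object is left untouched.
h <- hist(x, breaks = seq(0, 6, by = 0.5), plot = FALSE)

# Relative frequency per bin: counts divided by the number of observations.
rel_freq <- h$counts / sum(h$counts)

# The same values follow from the density: bar height times bar width.
all.equal(rel_freq, h$density * diff(h$breaks))

# Draw the relative frequencies as adjacent bars, labelled by bin midpoint.
barplot(rel_freq, names.arg = h$mids, space = 0,
        xlab = "days", ylab = "relative frequency")
```

With equal bin widths this differs from hist(x, freq = FALSE) only in
the scale of the y axis; the shape of the plot is the same.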




