[R] relative frequency plot

Erik Iverson iverson at biostat.wisc.edu
Thu Apr 27 22:13:08 CEST 2006


Martin -

Of course you are right.  The documentation for truehist (and hist) 
explains that fact nicely, which is why I thought to send him there. 
Sorry for any confusion.
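
For the archive, here is a minimal sketch of the point Martin makes below:
with hist(*, freq = FALSE) the bar heights are density values, which can
exceed 1, while the bar areas still sum to 1. The data x and the breaks are
made up for illustration only; they are not the original poster's data.

```r
set.seed(1)                              # illustrative data only (assumption)
x <- runif(200, min = 0, max = 0.5)

# Compute the histogram without drawing it; bins have width 0.05
h <- hist(x, breaks = seq(0, 0.5, by = 0.05), plot = FALSE)

# Density values can be larger than 1 when the bins are narrower than 1 ...
max(h$density) > 1

# ... but the bar areas (density * bin width) sum to 1
sum(h$density * diff(h$breaks))

# Relative frequencies are counts / n; they sum to 1 and never exceed 1
rel_freq <- h$counts / sum(h$counts)
sum(rel_freq)
```

Only when all bins have width 1 do the density values and the relative
frequencies coincide, as the ?hist excerpt quoted below says.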

Thanks,
Erik

Martin Maechler wrote:
>>>>>>"Erik" == Erik Iverson <iverson at biostat.wisc.edu>
>>>>>>    on Thu, 27 Apr 2006 13:44:16 -0500 writes:
> 
> 
>     Erik> See ?truehist in the MASS package.
> 
> Not in this case!
> truehist() also computes a density,
> and its values on the "y axis" are not probabilities, either!
>   hist(*, freq = FALSE)
> is fully sufficient here -- the problem of the original poster
> was to understand that a density can have values larger than 1.
> It may be interesting and is somewhat disappointing for us
> "teachers of statistics" to see how many people have posted in
> the past on this exact topic, sometimes even more or less
> assuming that R was doing some things wrongly because it showed
> densities (or density estimates as here) with values larger than
> one...  oh dear
>     "Mit der Dummheit kaempfen Goetter selbst vergebens."
>     ("Against stupidity the gods themselves contend in vain.")
>      - Friedrich Schiller, "Die Jungfrau von Orleans"
> 
> Martin
> 
>     >>   Philipp Pagel wrote:
>     >> On Thu, Apr 27, 2006 at 10:48:39AM -0700, nlei at sfu.ca
>     >> wrote:
>     >> 
>     >>> Hi All,
>     >>> 
>     >>> I want to use "hist" to get the relative frequency
>     >>> plot. But the range of the y axis is greater than 1, which
>     >>> I think should be less than 1 since it stands for the
>     >>> probability.
>     >>> 
>     >>> I'm confused. Could you please help me with it?
>     >> 
>     >> 
>     >> I was pretty confused by that, too, at first. The solution
>     >> is that freq=FALSE causes hist to plot the DENSITY rather
>     >> than frequency. And density is not necessarily the same
>     >> as relative frequency. Excerpt from ?hist:
>     >> 
>     >> density: values f^(x[i]), as estimated density values. If
>     >> 'all(diff(breaks) == 1)', they are the relative
>     >> frequencies 'counts/n' and in general satisfy sum[i;
>     >> f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.
>     >> 
>     >> If you want relative frequency try something like this:
>     >> 
>     >> myhist = hist(x, breaks=52, plot=FALSE)
>     >> # rescale the counts to relative frequencies (they now sum to 1)
>     >> myhist$counts = myhist$counts / sum(myhist$counts)
>     >> plot(myhist, main=NULL, border=TRUE, xlab="days", xlim=c(0,6), lty=2)
>     >> 
>     >> Not exactly clean, though -- we are messing with the
>     >> myhist object...
>     >> 
>     >> 
>     >> cu Philipp
>     >> 
> 



