[Rd] Error in R-Intro document (PR#13079)

Martin Maechler maechler at stat.math.ethz.ch
Sat Sep 27 18:25:26 CEST 2008


>>>>> "d" == davidhedin  <davidhedin at mac.com>
>>>>>     on Sat, 27 Sep 2008 07:10:06 +0200 (CEST) writes:

    d> Full_Name: David Hedin Version: R 2.6.0 GUI 1.21 OS: Mac
    d> 10.4.11 Submission from: (NULL) (24.205.60.123)


    d> page 64 of the R introduction document makes the claim

    d> "If the probability=TRUE argument is given, the bars
    d> represent relative frequencies instead of counts"

    d> This is wrong, the densities (relative frequency/class
    d> width) are given, not the relative frequency. It's only
    d> true when the class width is 1.

Thank you; I have added "divided by the bin width".
Note that one could argue that whether the paragraph you cite is
really wrong depends on the definition of "relative".
I could define "relative" to mean
"relative to both the total and the bin width"  :-)
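
To make the distinction concrete, here is a minimal sketch (not part
of the R-intro text) that checks how counts, relative frequencies and
densities relate in the object returned by hist(). The names `widths`
and `rel_freq` are mine, chosen only for illustration.

```r
## Sketch: relate counts, relative frequencies and densities.
h <- hist(women$weight, nclass = 7, plot = FALSE)

widths   <- diff(h$breaks)            # bin widths
rel_freq <- h$counts / sum(h$counts)  # relative frequencies, sum to 1

## density is relative frequency divided by bin width ...
all.equal(h$density, rel_freq / widths)

## ... so it is the bar *areas*, not the bar heights, that sum to 1
all.equal(sum(h$density * widths), 1)
```

Both comparisons should give TRUE; the two scales coincide only when
every bin has width 1.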


    d> What IS the code which will produce a relative frequency
    d> histogram?

Well, do you really want a y-axis scale which shows neither
counts nor the usual density?
I'd recommend against that.

Here's an example derived from  help(plot.histogram) :

 wwt <- hist(women$weight, nclass = 7, plot = FALSE)

 ## modify the result to show "relative frequencies"
 ## (the bins are equal-width here, so the first width applies to all)
 wt. <- wwt; wt.$density <- wwt$density * diff(wwt$breaks)[1]
 plot(wt., freq=FALSE, ylab="Relative Frequency")

 ## or, probably better, in percent
 wtP <- wwt; wtP$density <- wwt$density * 100 * diff(wwt$breaks)[1]
 plot(wtP, freq=FALSE, ylab="Relative Frequency [ % ]")
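
The example above multiplies by the first bin width only, which is
fine because all bins have the same width. As a hedged variant for
unequal bins, one could compute the relative frequencies directly from
the counts. The break points below and the names `hu`, `huR` and
`rel_freq` are assumptions made up for this sketch.

```r
## Sketch: relative frequencies with unequal bin widths.
x  <- women$weight
hu <- hist(x, breaks = c(110, 120, 130, 150, 170), plot = FALSE)

## relative frequency per bin, independent of the bin widths
rel_freq <- hu$counts / sum(hu$counts)

## same trick as above: overwrite the density component, then plot it
huR <- hu
huR$density <- rel_freq
plot(huR, freq = FALSE, ylab = "Relative Frequency")
```

With unequal widths the bar areas are no longer proportional to the
frequencies, which is one more reason to prefer the density scale there.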

But note that I would strongly advocate using the default of
counts instead of the above, since from counts, one intuitively
gets a notion of *precision* (most people have a crude
approximation of the Poisson built into their brains :-)
which is completely lost when switching to percents.
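
A rough sketch of that precision argument, assuming the crude Poisson
approximation in which a bin count n has standard deviation of about
sqrt(n). The table name `tab` and its column names are mine.

```r
## Sketch: counts carry their own rough precision; percents do not.
h <- hist(women$weight, nclass = 7, plot = FALSE)
n <- h$counts

tab <- data.frame(
  mid       = h$mids,            # bin midpoints
  count     = n,                 # raw counts
  approx_sd = sqrt(n),           # crude Poisson standard deviation
  percent   = 100 * n / sum(n)   # same bins expressed in percent
)
tab
```

The percent column by itself does not reveal whether a bin rests on
3 observations or on 3000, whereas the count column does.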

Martin Maechler, ETH Zurich


