[R] Y axis of 1-D Linear Discriminant Histograms

Bob Farmer farmerb at dal.ca
Wed Nov 18 18:00:34 CET 2009


Hi all.
I would like to understand what units are used on the y-axis when you
plot the one-dimensional predictions (histograms) from lda() (MASS)
discriminant function objects.

While the help file suggests that a histogram is returned by default,
the presumably proportion-like values for each group seem to add up to
more than 1. I'm also not sure how to interpret the code from
ldahist(), which, I believe, defines the height of each bin as

est1/(diff(breaks) * length(data[g == grp]))

where est1 is (as far as I can tell) the frequency within the bin, and
the denominator is apparently the bin width multiplied by the sample
size for that group's panel. It seems to me that a far more intuitive
result (a simple proportion per bin) would be returned if the
diff(breaks) component were removed entirely.
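
To make the two scalings concrete, here is a minimal sketch. It is not
the ldahist() source: it assumes hist()'s default breaks on a single
group (setosa), which need not match the breaks ldahist() chooses, and
the names scores, x, h, heights and props are only illustrative.

```r
library(MASS)

ld1 <- lda(Species ~ Sepal.Length + Sepal.Width, iris)

# LD1 scores for every observation in the training data
scores <- predict(ld1, dimen = 1)$x[, 1]

# Assumption: one group and hist()'s default breaks, for illustration only
x <- scores[iris$Species == "setosa"]
h <- hist(x, plot = FALSE)

est1   <- h$counts   # frequency within each bin
breaks <- h$breaks   # bin boundaries

# Scaling as in the expression quoted above
heights <- est1 / (diff(breaks) * length(x))

# Same thing with diff(breaks) removed: a plain proportion per bin
props <- est1 / length(x)

sum(heights * diff(breaks))  # total area of the bars under the first scaling
sum(props)                   # total height of the bars under the second scaling
```

Under the first scaling it is the heights multiplied by the bin widths
that sum to 1, so the heights alone can exceed 1 whenever the bins are
narrower than 1 unit. Under the second scaling the heights themselves
sum to 1.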

While I don't think my concern affects the shape of each group's
histogram, I'd much prefer to display a more intuitive y-axis.

Example:
library(MASS)
ld1 <- lda(Species ~ Sepal.Length + Sepal.Width, iris)
plot(ld1, type = "histogram", dimen = 1)
# (eyeballing it suggests that the sum of the "frequencies" reported on
# the y-axis for each group exceeds 1)

Thanks very much.
--Bob Farmer
Dalhousie University



