[R] Impaired boxplot functionality - mean instead of median
Martin Maechler
maechler at stat.math.ethz.ch
Fri Dec 2 08:36:02 CET 2005
{diverted back to R-help}
There are several R packages that provide plots of
"mean +/- SD" (or "mean +/- 2*SD" which is an approximate 95%
confidence interval for the case of normally distributed data)
or so-called "error bars".
E.g. function plotCI() in package 'gplots' and errbar() in
package 'Hmisc' or 'sfsmisc'.
I'm firmly convinced that boxplots shouldn't be (mis!)used for
drawing those (and the above functions do not use them).
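
For illustration only, here is a minimal base-graphics sketch of a
"mean +/- SD" plot that needs no extra package. The data are simulated and
the names `variant` and `y` are hypothetical, chosen to mimic a
three-level factor with one quantitative variable; it is an assumption
about the kind of data involved, not a recommended analysis.

```r
## Simulated data (hypothetical): a three-level factor and a numeric response
set.seed(1)
variant <- factor(rep(1:3, each = 20))
y <- rnorm(60, mean = c(10, 12, 15)[variant], sd = 2)

## Per-group mean and standard deviation
m <- tapply(y, variant, mean)
s <- tapply(y, variant, sd)
x <- seq_along(m)

## Plot the group means, then add mean +/- SD bars with arrows()
plot(x, m, xlim = c(0.5, 3.5), ylim = range(m - s, m + s),
     xaxt = "n", pch = 19, xlab = "variant", ylab = "y (mean +/- SD)")
axis(1, at = x, labels = names(m))
arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.05)
```

The packaged functions named above wrap the same idea; see their help
pages for the exact argument names.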
Regards,
Martin
>>>>> "Evgeniy" == Evgeniy Kachalin <ka4alin at yandex.ru>
>>>>> on Thu, 01 Dec 2005 19:39:18 +0300 writes:
Evgeniy> Martin Maechler wrote:
>> Boxplots were invented by John W. Tukey and I think should be
>> counted among the top "small but smart" achievements from the
>> 20th century. Very wisely he did *not* use mean and standard deviations.
>>
>> Even though it's possible to draw boxplots that are not boxplots
>> (and people only recently explained how to do this with R on this
>> mailing list), I'm arguing very strongly against this.
>>
>> If I see a boxplot - I'd want it to be a boxplot and not have
>> the silly (please excuse) 10%--------90% whiskers which
>> declare 20% of the points as outliers {in the boxplot sense}.
>>
>> If you want the mean +/- sd plot, do *not* misuse boxplots
>> for them, please!
>>
Evgeniy> So I analyze genetics data. I have some factor
Evgeniy> (gene variant, c(1,2,3)) and the quantitative
Evgeniy> variable corresponding to that factor. How do I
Evgeniy> visualize this situation? Compare the means of the
Evgeniy> samples corresponding to the factor values?
Evgeniy> Should boxplot support 'mean-in-the-middle', it
Evgeniy> would fit my needs ideally. How do I draw a mean +/-
Evgeniy> SD plot?
Evgeniy> Also, there is a way to rewrite boxplot.stats and
Evgeniy> replace "fivenum" there with a self-made
Evgeniy> function. Then I would need to write a self-made
Evgeniy> boxplot.formula (or boxplot.default?) function. And
Evgeniy> all this stuff would not be configurable. I'm still
Evgeniy> a novice in R, so I need a simple way to pre-visualize
Evgeniy> my data and estimate an approximate result.
Yes, there are ways, but no, I pretty strongly oppose the idea
of misusing the boxplot graphics to depict very different entities.
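
To make the remark about 10%--90% whiskers concrete, the following rough
sketch compares how many points end up beyond the whiskers under the
standard rule used by boxplot.stats() and under whiskers placed at the
10% and 90% quantiles. The simulated normal sample and the variable names
are assumptions made purely for illustration.

```r
## Simulated sample (hypothetical data)
set.seed(42)
x <- rnorm(1000)

## Standard boxplot rule: whiskers reach to the most extreme points
## within coef = 1.5 box lengths of the hinges; the rest are in $out
bs <- boxplot.stats(x)
frac_tukey <- length(bs$out) / length(x)

## Whiskers at the 10% and 90% quantiles instead
q <- quantile(x, c(0.10, 0.90))
frac_q <- mean(x < q[1] | x > q[2])

c(tukey = frac_tukey, q10_90 = frac_q)
```

By construction, the quantile whiskers leave about 20% of the points
outside, whereas the standard rule flags only a small fraction of a
normal sample.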