[R] boxplot notches

P. B. Pynsent p.b.pynsent at bham.ac.uk
Tue Mar 2 16:16:23 CET 2004


A Google search showed that all this was discussed in April 1998, with 
an extensive reply to the question from M Maechler.
As a non-statistician, I blindly believed what was written in the 
boxplot() help file. I am sure many would be grateful if this help 
page were modified.

I still do not understand why, 6 years later and with GHz processors, 
boxplot() could not have an option to produce exact intervals. After 
all, a range option is offered for the whiskers.
At least then non-overlapping notches would have some meaning, wouldn't 
they?
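
A minimal sketch of the idea, under stated assumptions: the data are 
simulated purely for illustration, and exact_median_ci() is a 
hypothetical helper, not an existing R function. It sets the notch 
limits reported by boxplot.stats() beside a distribution-free interval 
for a single median taken from order statistics. Note that this is an 
interval for one median, which is not the same thing as a test 
comparing two medians.

```r
# Sketch only: notch limits versus a distribution-free interval for one median.
# exact_median_ci() is a hypothetical helper, not part of R.

set.seed(1)          # simulated, skewed data purely for illustration
x <- rexp(25)
n <- length(x)

# Notch limits as computed by boxplot.stats(): median +/- 1.58 * IQR / sqrt(n),
# where the IQR is taken between the hinges returned by fivenum().
notch <- boxplot.stats(x)$conf
hinges <- fivenum(x)[c(2, 4)]
by_hand <- median(x) + c(-1, 1) * 1.58 * diff(hinges) / sqrt(n)

# Interval [x_(k), x_(n-k+1)] from the order statistics. For a continuous
# distribution its coverage is 1 - 2 * P(B <= k - 1), B ~ Binomial(n, 1/2);
# k is chosen so that the coverage is at least the requested level.
exact_median_ci <- function(x, level = 0.95) {
  xs <- sort(x)      # sort() drops NAs by default
  n <- length(xs)
  k <- qbinom((1 - level) / 2, n, 0.5)
  if (k < 1) stop("sample too small for the requested level")
  list(lower = xs[k], upper = xs[n - k + 1],
       coverage = 1 - 2 * pbinom(k - 1, n, 0.5))
}

ci <- exact_median_ci(x)
print(rbind(notch = notch, exact = c(ci$lower, ci$upper)))
print(ci$coverage)   # achieved coverage, at least the requested level
```

Because the binomial distribution is discrete, the achieved coverage is 
usually somewhat above the requested level rather than equal to it.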

On 2 Mar 2004, at 10:18, Christoph Scherber wrote:

> Dear colleagues,
>
> I think it would be a good idea to include a short note in the R 
> boxplot() help file, stating exactly how the confidence levels are 
> calculated ("the notches are +/- 1.58 IQR/sqrt(n)"), at least as 
> guidance for users not advanced enough to directly interpret the code.
>
> Would this be possible?
>
> Regards,
> Christoph.
>
> David James wrote:
>
>> Prof Brian Ripley wrote:
>>
>>> On Mon, 1 Mar 2004, Martin Maechler wrote:
>>>
>>>>>>>>> "TL" == Thomas Lumley <tlumley at u.washington.edu>
>>>>>>>>> on Mon, 1 Mar 2004 09:54:48 -0800 (PST) writes:
>>>>>>>>
>>>> TL> On Mon, 1 Mar 2004, Christoph Scherber wrote:
>>>> >> Dear list members,
>>>> >>
>>>> >> Can anyone tell me how the notches in boxplot(Y~X,notch=T) are
>>>> >> calculated? What do these notches represent exactly? I'd suppose they
>>>> >> are Confidence Intervals for the median, but I've also been told they
>>>> >> might show Least Significant Difference (LSD) equivalents.
>>>>
>>>> TL> The help page says that
>>>> TL> " If the notches of two plots do not overlap then
>>>> TL> the medians are significantly different at the 5 percent level."
>>>>
>>>> TL> The only thing wrong with this is that it isn't true.
>>>> TL> The code says that the notches are +/- 1.58 IQR/sqrt(n),
>>>> TL> so I think the claimed confidence level holds only for
>>>> TL> normal distributions with small amounts of contamination.
>>>>
>>>> I think John Tukey's idea was that this formula (or just the fact of
>>>> using median and quartiles) is still often approximately correct
>>>> for quite a few kinds of moderate contaminations...
>>>
>>> It may be approximately correct for the width of a CI (and when I
>>> checked it was only approximately correct for a normal), but I would
>>> seriously doubt if it were approximately correct for a significance
>>> level of 5%.
>>> Remember how fast the tails of the asymptotic normal distribution
>>> decay: a 20% error turns 5% into 2%.
>>>
>>> BTW, if there is a precise reference for this it would be good to
>>> add it to boxplot.stats.Rd, as the confidence limits are unexplained
>>> there.
>>
>>
>> @article{McGi:Tuke:Lars:1978,
>> author = {McGill, Robert and Tukey, John W. and Larsen, Wayne A.},
>> title = {Variations of {B}ox plots},
>> year = {1978},
>> journal = {The American Statistician},
>> volume = {32},
>> pages = {12--16},
>> keywords = {Exploratory data analysis; Graphics}
>> }
>>
>> @book{Cham:Clev:Klei:Tuke:1983,
>> author = {Chambers, John M. and Cleveland, William S. and Kleiner, Beat
>> and Tukey, Paul A.},
>> title = {Graphical methods for data analysis},
>> year = {1983},
>> pages = {395},
>> publisher = {Wadsworth Publishing Co Inc}
>> }
>>
>>> -- 
>>> Brian D. Ripley, ripley at stats.ox.ac.uk
>>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>>> University of Oxford, Tel: +44 1865 272861 (self)
>>> 1 South Parks Road, +44 1865 272866 (PA)
>>> Oxford OX1 3TG, UK Fax: +44 1865 272595
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide! 
>>> http://www.R-project.org/posting-guide.html
>>
>>
>
>
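
The arithmetic behind the quoted remark that "a 20% error turns 5% into 
2%" can be checked directly. This is only a sketch, assuming the remark 
means that the half-width of a nominal two-sided 5% normal interval is 
too large by a factor of 1.2; the variable names are illustrative.

```r
# Assumption: "20% error" means the critical value is 1.2 times too large.
z_nominal <- qnorm(0.975)            # two-sided 5% critical value, about 1.96
z_inflated <- 1.2 * z_nominal        # interval 20% too wide
p_actual <- 2 * pnorm(-z_inflated)   # actual two-sided tail probability
print(round(p_actual, 3))            # close to 0.02 rather than 0.05
```
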
P. B. Pynsent,
Research & Teaching Centre,
Royal Orthopaedic Hospital,
Northfield,
Birmingham, B31 2AP,
U. K.



