[R] Whiskers on the default boxplot {graphics}

Shi, Tao shidaxia at yahoo.com
Wed May 12 21:27:15 CEST 2010


Jason, 

All of this is clearly defined in the help file for 'boxplot', under the 'range' argument.  I don't understand how you missed that.
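
For anyone reading along, here is a minimal sketch of what 'range' does, using the mtcars call from earlier in the thread. The object names b_default and b_extreme are made up for illustration; plot = FALSE only returns the computed statistics instead of drawing.

```r
# Sketch: how the 'range' argument of boxplot() moves the whiskers.
# b_default and b_extreme are illustrative names, not from the thread.
b_default <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)             # range = 1.5
b_extreme <- boxplot(mpg ~ cyl, data = mtcars, range = 0, plot = FALSE)  # whiskers to min/max

# Rows of $stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
# Columns follow the groups (cyl = 4, 6, 8).
b_default$stats
b_default$out    # points drawn individually beyond the whiskers
b_extreme$stats
b_extreme$out    # empty: with range = 0 the whiskers reach the data extremes
```

With range = 0 the first and last rows of $stats should simply be the group minima and maxima, which matches the wording in the help file.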

...Tao




----- Original Message ----
> From: Jason Rupert <jasonkrupert at yahoo.com>
> To: Dennis Murphy <djmuser at gmail.com>
> Cc: R Project Help <R-help at r-project.org>
> Sent: Wed, May 12, 2010 3:40:12 AM
> Subject: Re: [R] Whiskers on the default boxplot {graphics}
> 
> Fantastic!
>
> It would be great if the description could be modified to include the
> mysterious bit about the upper and lower bound whisker positions:
>
> upper whisker = max(x[x <= Q_3 + 1.5 * IQR])
> lower whisker = min(x[x >= Q_1 - 1.5 * IQR])
>
> Maybe that is clearly written in the description of boxplot.stats
> {grDevices}, but evidently I missed it numerous times and also did not pick
> up on this intent from the original description of boxplot {graphics}.
>
> Your type of descriptive answer and helpfulness is much appreciated and one
> of the reasons I continue to endorse the R tool over numerous others.
>
> More like you and the tool may be headed for domination in the market.
>
> Thanks again!






________________________________
From: Dennis Murphy <djmuser at gmail.com>
Cc: R Project Help <R-help at r-project.org>
Sent: Wed, May 12, 2010 2:50:19 AM
Subject: Re: [R] Whiskers on the default boxplot {graphics}

Hi:

Let's do some math :)



e:

> Okay...Let me see if I've got it...
>
> I'm just trying to use the default boxplot {graphics} capability in R...
>
> So I call something like the following:
> boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data",
>         xlab="Number of Cylinders", ylab="Miles Per Gallon")
>
> That produces something as shown in the following:
> http://www.statmethods.net/graphs/images/boxplot1.jpg
>
> When that default boxplot is called, i.e. boxplot {graphics}, as shown in
> the line of code above, it is actually calling into boxplot.stats
> {grDevices}.  When boxplot.stats {grDevices} is called it has a default
> value for "coef" of 1.5, i.e. coef = 1.5.
>
> If I understand the purpose of "coef" correctly, it means that the
> 'whiskers' should extend out 1.5 times the length of the box away from the
> box.  Is that correct?
>

If by 'length of the box' you mean the interquartile range (IQR = Q_3 - Q_1,
where Q refers to quartile), then assuming that x is the numeric vector of
interest for a boxplot,

upper whisker = max(x[x <= Q_3 + 1.5 * IQR])
lower whisker = min(x[x >= Q_1 - 1.5 * IQR])

So the upper whisker is located at the *largest* x value that is no greater
than Q_3 + 1.5 IQR, whereas the lower whisker is located at the *smallest*
x value that is no less than Q_1 - 1.5 IQR. When no x value lies beyond
those limits, the whiskers simply end at max(x) and min(x).

In your terms, the whiskers should extend out a *maximum* of "1.5 times the
length of the box away from the box".

Visually, this means that individual points more extreme in value than
Q_3 + 1.5 IQR are plotted separately at the high end, and those below
Q_1 - 1.5 IQR are plotted separately on the low end. Depending on the source,
the separately plotted points are called 'outside values'. On the other hand,
if the maximum or minimum value of x is closer than 1.5 IQR to its nearest
quartile, then that is where the whisker is positioned.
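
The whisker positions described above can be checked by hand against boxplot.stats(). What follows is only a sketch under a few assumptions: it uses the 8-cylinder mpg values from mtcars as x, the variable names are illustrative, and it takes the hinges from fivenum() as the box edges, since that is what boxplot.stats() is documented to use (they can differ slightly from quantile()-based quartiles).

```r
# Sketch: reproduce the whisker ends for one group and compare with R.
x <- mtcars$mpg[mtcars$cyl == 8]

h   <- fivenum(x)           # min, lower hinge, median, upper hinge, max
iqr <- h[4] - h[2]          # "length of the box" (hinge spread)
upper_limit <- h[4] + 1.5 * iqr
lower_limit <- h[2] - 1.5 * iqr

# Whiskers end at the most extreme data points that are inside the limits
upper_whisker <- max(x[x <= upper_limit])
lower_whisker <- min(x[x >= lower_limit])

bs <- boxplot.stats(x)      # coef = 1.5 by default
c(lower_whisker, upper_whisker)
bs$stats[c(1, 5)]           # whisker ends as computed by boxplot.stats()
bs$out                      # values that would be plotted separately
```

For this group the lower whisker stops at a data value above lower_limit, and the values below that limit are returned in bs$out.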

Does that make sense?

HTH,
Dennis


> Now I look back at the plot, and I'm not sure how 1.5 times the length of
> the box corresponds with the whisker lengths shown in the image:
>
> http://www.statmethods.net/graphs/images/boxplot1.jpg
>
> Is it that the whisker length is a total of 1.5 times the length of the box
> and centered about the median (2nd Quartile)?
>
> Just trying to get a handle on this, so thanks again for all the help in
> deciphering this.
>
> ________________________________
> From: RJ Cunningham <robut at iinet.net.au>
> Cc: R Project Help <R-help at r-project.org>
> Sent: Tue, May 11, 2010 9:57:48 PM
> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>
> I think not. Isn't the "secret" here?
>
> Arguments:
>
>        x: a numeric vector for which the boxplot will be constructed
>           ('NA's and 'NaN's are allowed and omitted).
>
>     coef: this determines how far the plot 'whiskers' extend out
>           from the box.  If 'coef' is positive, the whiskers extend
>           to the most extreme data point which is no more than
>           'coef' times the length of the box away from the box. A
>           value of zero causes the whiskers to extend to the data
>           extremes (and no outliers be returned).
>
> do.conf,do.out: logicals; if 'FALSE', the 'conf' or 'out'
>           component respectively will be empty in the result.
>
> Details:
>
> The two 'hinges' are versions of the first and third quartile,...
>
> On Wed May 12 10:35 , Jason Rupert sent:
>
>> Humm....Maybe I need to look some place else than boxplot.stats
>> {grDevices} for a definition of how the upper/lower whiskers are produced.
>>
>> By any chance are they "the lowest datum still within 1.5 IQR of the lower
>> quartile, and the highest datum still within 1.5 IQR of the upper
>> quartile"?
>>
>> None of the links from boxplot.stats {grDevices} seemed to reveal the
>> secret definition of the R whiskers.
>>
>> Thanks again.
>>
>> ----- Original Message ----
>> To: David Winsemius <dwinsemius at comcast.net>
>> Cc: R Project Help <R-help at r-project.org>
>> Sent: Tue, May 11, 2010 9:26:25 PM
>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>
>> Wowzers...
>>
>> From ?boxplot.stats:
>>
>> Details
>>
>> The two 'hinges' are versions of the first and third quartile, i.e., close
>> to quantile(x, c(1,3)/4). The hinges equal the quartiles for odd n (where
>> n <- length(x)) and differ for even n. Whereas the quartiles only equal
>> observations for n %% 4 == 1 (n = 1 mod 4), the hinges do so additionally
>> for n %% 4 == 2 (n = 2 mod 4), and are in the middle of two observations
>> otherwise.
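
The odd/even statement in that help text can be seen on two tiny vectors. This is a small sketch only; the helper names hinges() and quartiles() are invented here for illustration, and quantile() is used with its default settings.

```r
# Sketch: hinges (from fivenum) versus quartiles (from quantile).
x_odd  <- 1:9     # n = 9
x_even <- 1:10    # n = 10, so n %% 4 == 2

hinges    <- function(x) fivenum(x)[c(2, 4)]
quartiles <- function(x) unname(quantile(x, c(1, 3) / 4))

hinges(x_odd);  quartiles(x_odd)     # the same for odd n
hinges(x_even); quartiles(x_even)    # differ for even n; hinges are observations here
```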
>>
>> The notches (if requested) extend to +/-1.58 IQR/sqrt(n). This seems to be
>> based on the same calculations as the formula with 1.57 in Chambers et al.
>> (1983, p. 62), given in McGill et al. (1978, p. 16). They are based on
>> asymptotic normality of the median and roughly equal sample sizes for the
>> two medians being compared, and are said to be rather insensitive to the
>> underlying distributions of the samples. The idea appears to be to give
>> roughly a 95% confidence interval for the difference in two medians.
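
The notch formula quoted from the help page can be compared with the whisker ends directly. This is a sketch assuming the 8-cylinder mpg values from mtcars as x and illustrative variable names; it takes the IQR as the spread between the hinges, which is how the formula appears to be applied in boxplot.stats().

```r
# Sketch: notch ends (median +/- 1.58 * IQR / sqrt(n)) versus whisker ends.
x  <- mtcars$mpg[mtcars$cyl == 8]
bs <- boxplot.stats(x)

iqr   <- bs$stats[4] - bs$stats[2]                       # hinge spread
notch <- bs$stats[3] + c(-1.58, 1.58) * iqr / sqrt(bs$n)

notch                # notch ends computed by hand
bs$conf              # notch ends reported by boxplot.stats()
bs$stats[c(1, 5)]    # whisker ends, for comparison
```

The notch is an interval around the median, so it sits inside the box region and is a different quantity from the whiskers.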
>>
>> Is a notch equal to the upper/lower whisker?  Is this just a difference of
>> terminology or something?
>>
>> Thanks again for all the insights.
>>
>> ----- Original Message ----
>> From: David Winsemius <dwinsemius at comcast.net>
>> Cc: R Project Help <R-help at r-project.org>
>> Sent: Tue, May 11, 2010 9:00:15 PM
>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>>
>>> On May 11, 2010, at 9:45 PM, Jason Rupert wrote:
>>>
>>>> How are the lower/upper whiskers defined in the default version of
>>>> boxplot {graphics}?
>>>>
>>>> I tried help(boxplot) and searching www.rseek.org, but I was unable to
>>>> determine an absolute answer.
>>>
>>> You need to follow the links from the help pages and in this case it
>>> appears that you did not follow the one to
>>>
>>> ?boxplot.stats
>>>
>>>> I checked out the definition of boxplot according to Wikipedia
>>>> (http://en.wikipedia.org/wiki/Box_plot%5C), but it also had several
>>>> approaches listed for how the whiskers could be determined, so I'm just
>>>> curious how the default boxplot {graphics} does it.
>>>>
>>>> Thanks for any feedback
>>>
>>> Follow links with the R help system.
>>>
>>>> and insights.
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.