[R] box plot and plot whiskers

S Ellison S.Ellison at LGCGroup.com
Fri Jul 13 18:44:18 CEST 2012


 

> -----Original Message-----
> I have a question concerning the box plot and its whiskers. As I 
> understood from the description of the boxplot() function, if 
> the range value is positive the plot whiskers extend out from 
> the box to the most extreme data points defined by the values 
> of  the IQR  times range (default 1.5).  
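... from the box. For a normal distribution, N(mu, sigma), the expected position of the whisker ends would be at about mu +/- 4*0.674*sigma: the quartiles sit at mu +/- 0.674*sigma, the IQR is therefore 2*0.674*sigma, and 1.5 times the IQR beyond each quartile adds another 3*0.674*sigma. (That corresponds to a two-tailed 99.3% interval, if I've not lost a factor of two somewhere.)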
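
As a quick check of that arithmetic, here is a minimal sketch for a standard normal (mu = 0, sigma = 1; other values just shift and scale the result). The variable names are mine, chosen for illustration only.

```r
# Upper quartile of a standard normal, about 0.674
q <- qnorm(0.75)
# Interquartile range: distance between the quartiles at -q and +q
iqr <- 2 * q
# Upper limit for the whisker: Q3 + 1.5 * IQR = q + 3q = 4q (in units of sigma)
fence <- q + 1.5 * iqr
# Two-tailed probability of an observation falling within +/- fence
coverage <- 2 * pnorm(fence) - 1
print(c(quartile = q, fence = fence, coverage = coverage))
```

The whisker itself stops at the most extreme observation inside that limit, so in a real sample it ends at or a little short of the computed position.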

> It suggests that the 
> upper and lower plot whiskers should be more or less the same length.
>  What does it mean if they are not? How is it possible? 
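The end of each whisker is always a data point in your data set, and data can be anywhere. The whisker lengths therefore depend on where the observations happen to fall, and they need not be equal even when the underlying distribution is symmetric.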
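
To see that the whisker ends really are observations, one can look at what boxplot.stats() returns. This is only a sketch: the skewed sample from rexp() and the seed are arbitrary choices of mine for illustration.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
x <- rexp(30)                # a skewed sample, chosen only for illustration
# $stats holds: lower whisker end, lower hinge, median, upper hinge, upper whisker end
s <- boxplot.stats(x)$stats
# Both whisker ends are values that occur in the data
print(s[c(1, 5)] %in% x)
# Whisker lengths measured from the hinges (the box edges)
print(c(lower = s[2] - s[1], upper = s[5] - s[4]))
```

With skewed data like this the two lengths will usually differ noticeably.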

In small data sets (under 20 per group) the whiskers can vary quite a lot by chance; for example try
set.seed(1027)
y <- rnorm(150)
g <- gl(10,15)
boxplot(y ~ g)
# and note group 2.
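
If you prefer numbers to eyeballing the plot, the same simulated groups can be tabulated. A sketch, assuming the statistics returned by boxplot(..., plot = FALSE); the name whisker_len is mine.

```r
set.seed(1027)
y <- rnorm(150)
g <- gl(10, 15)
# Compute the box plot statistics without drawing anything
b <- boxplot(y ~ g, plot = FALSE)
# Rows of b$stats: lower whisker, lower hinge, median, upper hinge, upper whisker
whisker_len <- rbind(lower = b$stats[2, ] - b$stats[1, ],
                     upper = b$stats[5, ] - b$stats[4, ])
colnames(whisker_len) <- b$names
print(round(whisker_len, 2))
```

Each column is one group of 15, all drawn from the same normal distribution, so any difference between the two rows is chance alone.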

In bigger data sets the quantiles are less variable, and a difference in whisker lengths, like a difference in the lengths of the two parts of the box, becomes a more reliable indicator of asymmetry.
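
A rough simulation illustrates the effect of sample size. This is a sketch under my own assumptions: normal data, group sizes of 15 and 1500, 500 replicates and an arbitrary seed; whisker_diff() is a helper made up for this example.

```r
set.seed(42)   # arbitrary seed
# Upper minus lower whisker length for one normal sample of size n
whisker_diff <- function(n) {
  s <- boxplot.stats(rnorm(n))$stats
  (s[5] - s[4]) - (s[2] - s[1])
}
small <- replicate(500, whisker_diff(15))
large <- replicate(500, whisker_diff(1500))
# Spread of the difference when the true distribution is symmetric
print(c(sd_n15 = sd(small), sd_n1500 = sd(large)))
```

The smaller the spread of that difference under a symmetric distribution, the more an observed difference in whisker length tells you about real asymmetry.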

S Ellison


