[R] boxplot question

Peter Ehlers ehlers at ucalgary.ca
Sat Nov 21 01:40:25 CET 2009


If there's been an answer to this, I've missed it.
Here's my take.

Antje wrote:
> Hi there,
> 
> I was wondering if anybody can explain to me why the boxplot ends up 
> with different results in the following case:
> 
> I have some integer data as a vector and I compare the stats of boxplot 
> with the same data divided by a factor.
> 
> I've attached a csv file with both data present (d1, d2). The factor is 
> 34.16667.
> 
> If I run the boxplot function on d1 I get the following stats:
> 
> 0.848...
> 0.907...
> 0.936...
> 0.965...
> 1.024...
> 
> For d2 I get these stats:
> 
> 29
> 31
> 32
> 33
> 36
> 
> 
> If I convert the stats of d1 with the factor, I get
> 
> 29
> 31
> 32
> 33
> 35
> 
> Obviously different for the upper whisker. But why???
> 
> Antje

Antje:

Three comments:
1. I think your 'factor' is actually 205/6, not 34.16667.

2. This looks like another case of R FAQ 7.31 ("Why doesn't R think
these numbers are equal?"):

# Let's take your d2 and create d1; I'll call them x and y:
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205

# x is your d2, sorted
# y is your d1, sorted

# The critical values are x[202:203] and y[202:203];
x[201:204]
#[1] 35 35 36 36

# The boxplot stats are:
sx <- boxplot.stats(x)$stats
sy <- boxplot.stats(y)$stats

# Calculate the potential extent of the upper whisker
# (upper hinge + 1.5 * hinge spread):
ux <- sx[4] + (sx[4] - sx[2]) * 1.5  # 36
uy <- sy[4] + (sy[4] - sy[2]) * 1.5  # 1.053658536585366

# Is y[203] <= uy?
y[203] <= uy
#[1] FALSE  #!!!

y[202] <= uy
#[1] TRUE

# For x:
x[203] <= ux
#[1] TRUE

And there's your answer: for y the upper whisker
goes to y[202] (35 in the original units), not y[203] (36),
because floating-point rounding leaves y[203] marginally
above uy, although the two are equal in exact arithmetic.
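
To see how small the discrepancy is, here is a minimal sketch that reuses the data from above. The name sx_scaled is introduced only for illustration, and taking the statistics on the integer scale before rescaling is just one possible workaround, not something suggested in the original exchange. It assumes ordinary IEEE 754 double arithmetic.

```r
# Same data as above
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205

# Upper fence for y, computed the same way as above
sy <- boxplot.stats(y)$stats
uy <- sy[4] + (sy[4] - sy[2]) * 1.5

# Gap between the observation and the fence, in units of machine epsilon
(y[203] - uy) / .Machine$double.eps

# The two values agree within all.equal()'s default numerical tolerance
isTRUE(all.equal(y[203], uy))

# Possible workaround (an assumption, see above): take the stats on the
# integer scale, where the fence comparison is exact, then rescale
sx_scaled <- boxplot.stats(x)$stats * 6 / 205

# Back in the original units the upper whisker is at 36 again
sx_scaled * 205 / 6
```

With this approach the rescaled statistics for y correspond to the same observations as the statistics for x, because the comparison against the fence is done on values that are represented exactly.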

3. Last comment: I would not use boxplots for data like this.
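
The reply does not say what to use instead. As one hedged possibility for data with only a handful of distinct integer values, a plain frequency table shows the whole distribution without any fence comparison. The names counts and props below are illustrative only.

```r
# Same data as above
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))

# Frequency of each distinct value
counts <- table(x)
counts

# Relative frequencies, rounded for display
props <- round(prop.table(counts), 3)
props

# barplot(counts) would draw the same frequencies as bars
```

Dividing by a constant only relabels the values in such a table, so it gives the same picture for both versions of the data.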

  -Peter Ehlers

