[R] Whiskers on the default boxplot {graphics}
Peter Ehlers
ehlers at ucalgary.ca
Thu May 13 18:10:50 CEST 2010
David,
try this:
fivenum(1:101)
quantile(1:101, c(1,3)/4, type=5)
-Peter
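
A note on the two calls above: with 101 distinct values, the hinges from fivenum() and the type=5 quartiles do not coincide, which appears to be the point of the example. A minimal sketch in base R follows; the helper name same_as_hinges is made up for illustration and is not part of any package.

```r
# Hinges vs. type=5 quartiles for n = 101
x <- 1:101
hinges <- fivenum(x)[c(2, 4)]
q5 <- unname(quantile(x, c(1, 3) / 4, type = 5))
print(hinges)  # 26 76
print(q5)      # 25.75 76.25

# Hypothetical helper: do the hinges of 1:n equal the type=5 quartiles?
same_as_hinges <- function(n) {
  z <- seq_len(n)
  isTRUE(all.equal(fivenum(z)[c(2, 4)],
                   unname(quantile(z, c(1, 3) / 4, type = 5))))
}

# Compare across four consecutive sample sizes
agree <- sapply(100:103, same_as_hinges)
print(agree)  # TRUE FALSE TRUE FALSE
```

In this sketch the two agree for the even sample sizes and differ for the odd ones, so a match seen with 300 observations need not carry over to 1:101.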
On 2010-05-13 8:55, David Winsemius wrote:
>
> On May 13, 2010, at 10:25 AM, Robert Baer wrote:
>
>>> Hi Peter,
>>>
>>> You're absolutely correct! The description for 'range' in 'boxplot'
>>> help file is a little bit confusing by using the words "interquartile
>>> range". I think it should be changed to the "length of the box" to be
>>> exact and consistent with those in the help file for "boxplot.stats".
>>
>> The issue is probably that there are multiple ways (9 to be exact) of
>> defining quantiles in R. See the 'type=' argument in ?quantile. The
>> quantile function uses type=7 by default, which matches the quantile
>> definition used by S-Plus(?) but differs from that used by SPSS.
>> Doesn't fivenum essentially use the equivalent of a different 'type='
>> argument (maybe 2 or 5) in constructing the hinges?
>>
>> It seems perfectly reasonable to talk about 'length of box' (or 'box
>> height', depending on how you display the boxplot), but aren't the
>> hinges simply Q1 and Q3 defined by one of the possible quartile
>> definitions (as Peter points out, the one used by fivenum)? The box
>> height does not necessarily match the distance produced by IQR(),
>> which also seems to use the equivalent of quantile(..., type=7), but
>> it is still an IQR, is it not?
>>
>> Quantiles apparently can be defined in more than one "acceptable" way
>> (sort of like dealing with ties in rank statistics). The OP seemed to
>> want an "exact" explanation of the whiskers, and I think Peter has
>> pointed us at the definition of quartiles used by fivenum, as opposed
>> to the default used with quantile(..., type=7).
>
> Yes, and experimentation leads me to the conclusion that the only
> possible candidate for matching up the results of fivenum(y)[c(2,4)]
> with quantile(y, c(1,3)/4, type=i) is type=5. I'm not able to prove
> that to myself from mathematical arguments, since I do not quite
> understand the formalism in the quantile help page. If the match is
> not exact, this would be a tenth definition of IQR.
>
> > set.seed(123)
> > y <- rexp(300, .02)
> > fivenum(y)
> [1] 0.2183685 15.8740466 42.1147820 74.0362517 360.5503788
> > for (i in 4:9) {print(quantile(y, c(1,3)/4, type=i) ) }
> 25% 75%
> 15.82506 73.93080
> 25% 75%
> 15.87405 74.03625
> 25% 75%
> 15.84955 74.08898
> 25% 75%
> 15.89854 73.98352
> 25% 75%
> 15.86588 74.05383
> 25% 75%
> 15.86792 74.04943
>
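
The comparison in the quoted session can also be done programmatically instead of by eye. This is a sketch, assuming the same seed and sample as above; the name matches is illustrative, and equality is judged with the default tolerance of all.equal(). It runs over all nine types, whereas the quoted loop covers 4:9 only.

```r
# Same sample as in the quoted session
set.seed(123)
y <- rexp(300, 0.02)
hinges <- fivenum(y)[c(2, 4)]

# For each quantile type, test whether the quartiles equal the hinges
matches <- sapply(1:9, function(i) {
  q <- unname(quantile(y, c(1, 3) / 4, type = i))
  isTRUE(all.equal(q, hinges))
})

# Types whose quartiles agree with the hinges for this sample
print(which(matches))
```

Any agreement found this way holds for this sample size only; it says nothing about other values of n.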
--
Peter Ehlers
University of Calgary