[R] Summary vs fivenum results for Q3

Ken Knoblauch knoblauch at lyon.inserm.fr
Tue Oct 9 17:32:50 CEST 2007


Schaefer, Robert L. Dr. <schaefrl <at> muohio.edu> writes:

> I've just started using R and am still a neophyte, but I found the following
> curious result.  I'm using the current version of R (2.5.1 (2007-06-27)).
> 
> Why are the results for the third quartile different in the output from the
> summary and fivenum commands?
> For the following data set
> 
> 457 514 530 530 538 560 687 745 745 778 786 790 792 821 821 822 822
> 828 845 850 886 886 886 913 1050 1050 1065 1065 1065 1065 1090 1130
> 
> Summary yields:
> 
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   457.0   745.0   822.0   825.4   947.2  1130.0
> 
> While fivenum yields:
> 
> [1]  457.0  745.0  822.0  981.5 1130.0
> 
> The third quartile is being correctly calculated in the fivenum command and
> incorrectly in the summary command.
> 
> Bob

If you look in ?boxplot.stats, it says:

The two “hinges” are versions of the first and third quartile, i.e., close to
quantile(x, c(1,3)/4). The hinges equal the quartiles for odd n (where
n <- length(x)) and differ for even n. Where the quartiles only equal
observations for n %% 4 == 1 (n = 1 mod 4), the hinges do so additionally for
n %% 4 == 2 (n = 2 mod 4), and are in the middle of two observations otherwise.
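
To see what that passage means in practice, here is a minimal sketch on toy
vectors 1:5 through 1:8 (illustrative data, not from the question). The helper
name quartiles_vs_hinges is made up for this example, and quantile() is used
with its default type.

```r
# Hypothetical helper: quartiles (default quantile type) next to the
# hinges reported by fivenum(), for the vector 1:n.
quartiles_vs_hinges <- function(n) {
  x <- seq_len(n)
  c(n = n,
    q1 = unname(quantile(x, 0.25)),
    q3 = unname(quantile(x, 0.75)),
    lower_hinge = fivenum(x)[2],
    upper_hinge = fivenum(x)[4])
}

# One row per n; odd n (5, 7) agree, even n (6, 8) do not.
t(sapply(5:8, quartiles_vs_hinges))
```

For n = 6 (n %% 4 == 2) the hinges land on observations (2 and 5) while the
quartiles are 2.25 and 4.75; for n = 8 the hinges fall midway between two
observations (2.5 and 6.5).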

I got here by looking at summary.default and seeing that it uses the quantile
function, and then looking at fivenum to see that it does not. Looking at the
help for fivenum led me to boxplot.stats, where I saw that it was not
necessarily doing the same thing.
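
For the data in the question, the two calculations can be reproduced directly.
This is only a sketch: the variable names are mine, and the last line assumes
quantile()'s type argument, where type = 2 averages at discontinuities and
happens to match the hinge for this particular data set (it need not in
general).

```r
x <- c(457, 514, 530, 530, 538, 560, 687, 745, 745, 778, 786, 790, 792,
       821, 821, 822, 822, 828, 845, 850, 886, 886, 886, 913, 1050, 1050,
       1065, 1065, 1065, 1065, 1090, 1130)

n <- length(x)  # 32, an even n, so hinges and quartiles may differ

# Default quantile (type 7): interpolate at position 1 + (n - 1) * 0.75 = 24.25,
# i.e. a quarter of the way from x[24] = 913 to x[25] = 1050.
q3_quantile <- unname(quantile(x, 0.75))            # 947.25 (summary prints 947.2)

# Upper hinge from fivenum: midpoint of x[24] and x[25].
q3_hinge <- fivenum(x)[4]                           # 981.5

# A different quantile definition that agrees with the hinge here.
q3_type2 <- unname(quantile(x, 0.75, type = 2))     # 981.5
```

So the two commands answer slightly different questions: summary reports an
interpolated quantile, fivenum reports a hinge.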

HTH

-- 
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France


