[R] boxplot with average instead of median

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue Aug 5 15:31:43 CEST 2008


Another option is to modify panel.bpplot in the Hmisc package, save the
modified copy under a name such as mypanel, and specify

library(lattice)
bwplot(..., panel=mypanel)

Note that panel.bpplot will show the mean.  It shows more quantiles than 
a standard box plot so you get more than a 3-number summary.

If you show the mean and standard deviation you are assuming much 
(especially symmetry) about the distribution you are trying to show.
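
To illustrate the symmetry point, here is a minimal sketch using made-up,
right-skewed data (the vector x and the names mean_sd_box and quartile_box
are illustrative assumptions, not from this thread). It compares a
mean +- SD "box" with the usual quartile box.

```r
# Made-up, right-skewed data: mostly small values plus one large one.
x <- c(1, 1, 1, 2, 2, 3, 50)

m <- mean(x)
s <- sd(x)

# "Box" edges under a mean +- SD summary (implicitly assumes symmetry).
mean_sd_box <- c(lower = m - s, centre = m, upper = m + s)

# Box edges under the usual quartile summary.
quartile_box <- quantile(x, c(0.25, 0.5, 0.75))

print(mean_sd_box)
print(quartile_box)
print(range(x))
```

With these values the lower edge of the mean +- SD box falls below the
smallest observation, and the mean sits above the upper quartile, so the
box describes a region where most of the data are not.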

Frank


S Ellison wrote:
> boxplot itself is hardwired to produce the boxplot.stats list, and that
> is not easy to change.
> 
> To get a different set of stats, you would need to do things in two
> stages:
> i) create a boxplot object of the type returned by boxplot, but using
> your own stats
> ii) call bxp on that object.
> 
> That's kind of tricky.
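
The two-stage route described in the quoted text can be sketched roughly as
follows, using base R only. The helper mean_sd_stats and the simulated data
are assumptions made for illustration; the list mimics the components that
boxplot() returns (stats, n, out, group, names), with no outliers marked.

```r
# Hypothetical helper: five box values per group, based on mean and SD.
mean_sd_stats <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  s <- sd(x)
  # Whiskers at the data extremes, widened if mean +- sd lies outside them.
  c(min(min(x), m - s), m - s, m, m + s, max(max(x), m + s))
}

set.seed(1)
x <- rnorm(100)
g <- gl(5, 20)
groups <- split(x, g)

# Stage i: build a boxplot-like object with our own stats.
z <- list(
  stats = sapply(groups, mean_sd_stats),  # 5 rows, one column per group
  n     = sapply(groups, length),
  out   = numeric(0),                     # no outliers marked in this sketch
  group = numeric(0),
  names = levels(g)
)

# Stage ii: hand the object to bxp() (skipped in non-interactive runs).
if (interactive()) bxp(z, main = "Mean +- SD via bxp()")
```

Marking outliers would mean filling out and group with the flagged values
and their group indices, which is where most of the trickiness lies.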
> 
> One comparatively simple alternative is to use the lattice package's
> bwplot, and specify an alternate function for the stats parameter. You
> have to write the alternate function, though. Here's one that would
> probably do something like what you want; it is intended to deliver
> boxes set to mean +-sd, outliers marked outside mean+-2.5sd by default
> and whiskers set to the outermost of mean+-sd or outermost non-outlier
> data.
> 
> Not that I'd recommend it, but it's entertaining writing it. With a bit
> more wrapping, it could be used to generate a bxp-like object as well,
> as per Uwe's suggestion.
> 
> boxplot.norm <- function(x, do.conf=TRUE, coef=1.5, do.out=TRUE, p=0.05) {
>     xx <- x[!is.na(x)]
>     n <- length(xx)
>     s <- sd(xx)
>     m <- mean(xx)
>     stats <- c(min(xx), m-s, m, m+s, max(xx))
> 
>     #for compatibility with boxplot.stats
>     if (coef == 0) do.out <- FALSE
> 
>     if (do.out) {
>         #coef+1 gives outliers outside mean+-2.5s, because bwplot
>         #passes its default coef=1.5 to the stats function, and outlier
>         #marking at 2.5s is not a million miles from boxplot.stats's
>         #lower/upper quartiles -/+ 1.5*iqr if normality is assumed
>         out <- abs(xx-m)/s > (coef+1)
>     } else {
>         out <- rep(FALSE, n)
>     }
> 
>     if (any(out))
>         stats[c(1, 5)] <- range(xx[!out])
> 
>     #and tidy up any silly whiskers... mean+-sd can be outside the
>     #outer data points
>     stats[1] <- min(stats[1:2])
>     stats[5] <- max(stats[4:5])
> 
>     #Note: this is simply the (1-p)% confidence interval, not the notch
>     #width required for a pairwise test at (1-p)% confidence. If notches
>     #don't overlap, though, you certainly have a significant difference
>     #at _at least_ the (1-p)% level.
>     #But bwplot can't use it anyway, 'cos it doesn't do notches.
>     conf <- if (do.conf && n > 1)
>         stats[3] + c(-1,1) * s * qt(1-p/2, n-1)/sqrt(n)
> 
>     list(stats = stats, n = n, conf = conf, out = xx[out])
> }
> 
> 
> ##Try it out...
> require(lattice)
> x<-rnorm(100)
> g<-gl(5,20)
> bwplot(x~g, main="The default")
> 
> dev.new()   #opens a new graphics device on any platform (was windows())
> bwplot(x~g, stats=boxplot.norm, main="Mean +- SD")
> 
> 
> 
> 
> 
>>>> Chad Junkermeier <junkermeier at byu.edu> 05/08/2008 05:36 >>>
> I really like the ease of use with the boxplot command in R.  I would
> rather have a boxplot that shows the average value and the standard
> deviation than the median value and the quartiles.
> 
> Is there a way to do this?
> 
> 
> Chad Junkermeier, Graduate Student
> Dept. of Physics
> West Virginia University
> PO Box 6315
> 210 Hodges Hall
> Morgantown WV 26506-6315
> phone: (304) 293-3442 ext. 1430
> fax: (304) 293-5732
> email: chad.junkermeier{at}mail.wvu.edu
> -----------------------------------------------------
> Concurrently at:
> Dept. of Physics and Astronomy
> Brigham Young University
> Provo UT 84602
> email: junkermeier{at}byu.edu
> 
> cell: (801) 380-8895
> 

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


