[Rd] median and data frames

William Dunlap wdunlap at tibco.com
Fri Apr 29 18:09:14 CEST 2011


> From: r-devel-bounces at r-project.org 
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Martin Maechler
> Sent: Friday, April 29, 2011 7:25 AM
> To: Paul Johnson
> Cc: r-devel
> Subject: Re: [Rd] median and data frames
> [ ... lots of lines elided ... ] 
> My vote is for deprecating  mean.data.frame().

While R's data.frame method for mean(x) returns
the same thing as colMeans(x), S+'s (since 2005)
returns the same thing as mean(as.matrix(x)).  (Strictly,
it calls numerical.matrix(x), which turns non-numeric
columns into columns of numeric NAs.)  I usually favor
making data.frames act more like matrices when possible
(since users often conflate the two classes), and I
like having all the methods of a generic function return
the same sort of thing (a single value in this case).
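
To make the difference concrete, here is a minimal sketch in R.
The data frame df and its columns a and b are made up for
illustration, and mean(as.matrix(x)) only stands in for the S+
behavior on all-numeric data; it does not reproduce
numerical.matrix().

```r
# Hypothetical all-numeric data frame, for illustration only.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Column-wise result: one value per column.
col_means <- colMeans(df)
print(col_means)    # a = 2, b = 20

# Matrix-style result: a single value over all cells.
grand_mean <- mean(as.matrix(df))
print(grand_mean)   # 11
```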

It is often nonsensical to ask for the mean of an
entire data.frame, as the columns may have different
units even when they are all numeric.  It does make
sense when you use a tool like read.table() or S+'s
importData() to import a matrix and you don't notice
that it is stored as a data.frame.  It also makes sense
when you have a single-column data.frame or matrix,
perhaps arising from the use of drop=FALSE when
subscripting.
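
Both situations are easy to reproduce.  The following is only a
sketch: the inline text passed to read.table() and the column
names are invented for the example, and the text= argument
assumes a reasonably recent R.

```r
# A matrix imported with read.table() arrives as a data.frame.
imported <- read.table(text = "1 2\n3 4")
print(is.data.frame(imported))     # TRUE
imported_mean <- mean(as.matrix(imported))
print(imported_mean)               # 2.5

# Subscripting with drop = FALSE keeps a one-column data.frame.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
one_col <- df[, "a", drop = FALSE]
print(is.data.frame(one_col))      # TRUE
single_mean <- mean(as.matrix(one_col))
print(single_mean)                 # 2
```

In both cases a single overall mean is what the user most likely
wanted.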

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> Martin
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 


