[R] mean vs sum behavior

Gavin Simpson gavin.simpson at ucl.ac.uk
Mon Mar 31 19:02:03 CEST 2008


On Mon, 2008-03-31 at 18:41 +0200, Emmanuel Castella wrote:
> Dear all
> Could someone explain to me why
> lapply(split(x,fac),mean)
> returns means per levels of fac for each column of x
> whereas
> lapply(split(x,fac),sum)
> does not return sums per level of fac and columns of x, but adds all 
> columns together?
> Hence, how can I get sum to behave as mean in this instance?
> Thank you very much in advance
> E. Castella

You didn't tell us what x is, but I suspect it is a data frame. mean() has a
method for class "data.frame", which returns the mean of each column.
sum() has no such column-wise behaviour: given a data frame, it simply adds
up all the numbers in it and returns a single value.
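
To see the difference on something small, here is a minimal sketch using a
made-up two-column data frame (the values are mine, chosen only so the sums
are easy to check by hand). Note also that the data.frame method for mean()
was removed in R 3.0.0, so on current versions colMeans() is the column-wise
counterpart.

```r
# Small made-up data frame, used only for illustration
dat <- data.frame(a = 1:4, b = c(10, 20, 30, 40))

# sum() collapses the whole data frame into a single number
total <- sum(dat)

# colSums() returns a named vector with one sum per column
by_col <- colSums(dat)

total    # 110
by_col   # a = 10, b = 100
```

So sum() gives the grand total, which is what the original lapply() call was
returning for each level of fac.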

If you want to replicate the mean behaviour with sum, the following
would suffice:

> fac <- gl(4, 50)
> dat <- data.frame(a = rnorm(200), b = rnorm(200), c = rnorm(200))
> sp <- split(dat, fac)

then

> lapply(sp, function(x) sapply(x, sum))

or even quicker and easier:

> lapply(sp, colSums)
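
As a rough check that the two versions agree, something like the following
could be run. The set.seed() call is my addition, only there to make the
random data reproducible. The aggregate() call is an alternative, not
something you asked for, in case a single data frame with one row per level
is more convenient than a list.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
fac <- gl(4, 50)
dat <- data.frame(a = rnorm(200), b = rnorm(200), c = rnorm(200))
sp  <- split(dat, fac)

res1 <- lapply(sp, function(x) sapply(x, sum))
res2 <- lapply(sp, colSums)

# Both lists hold one named vector of column sums per level of fac
all.equal(res1, res2)

# Same sums as one data frame: a 'fac' column plus one column per variable
agg <- aggregate(dat, by = list(fac = fac), FUN = sum)
agg
```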

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


