[R] Weighted.mean(x,wt) vs. t(x) %*% wt

Liaw, Andy andy_liaw at merck.com
Tue Jan 25 02:40:58 CET 2005


Just look at the code:

> weighted.mean
function (x, w, na.rm = FALSE) 
{
    if (missing(w)) 
        w <- rep.int(1, length(x))
    if (is.integer(w)) 
        w <- as.numeric(w)
    if (na.rm) {
        w <- w[i <- !is.na(x)]
        x <- x[i]
    }
    sum(x * w)/sum(w)
}
<environment: namespace:stats>

So the differences are:

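- missing-value handling: weighted.mean() can drop NAs via na.rm = TRUE
- weight normalization: weighted.mean() divides by sum(w)
- the computation itself, t(x) %*% w vs. sum(x * w) (I'd say the latter is
more efficient); also, %*% returns a 1 x 1 matrix rather than a plain number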
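
To make those differences concrete, here is a minimal sketch on a tiny
made-up vector. The data and variable names are illustrative assumptions,
not part of the original question:

```r
# Toy data (hypothetical): one missing value, unnormalized weights.
x <- c(1, 2, NA, 4)
w <- c(1, 1, 1, 2)

# 1. Missing values: weighted.mean() can drop them, the matrix product cannot.
wm_na <- weighted.mean(x, w, na.rm = TRUE)  # (1*1 + 2*1 + 4*2) / (1 + 1 + 2)
mp_na <- drop(t(x) %*% w)                   # NA propagates through %*%

# 2. Normalization: weighted.mean() divides by sum(w), %*% does not.
x2 <- c(1, 2, 4)
w2 <- c(1, 1, 2)
wm      <- weighted.mean(x2, w2)                 # 11 / 4
mp      <- drop(t(x2) %*% w2)                    # 11, the raw weighted sum
mp_norm <- drop(t(x2) %*% (w2 / sum(w2)))        # 11 / 4 once w2 sums to 1

# 3. Return type: %*% gives a 1 x 1 matrix; drop() turns it into a number.
mp_is_matrix <- is.matrix(t(x2) %*% w2)
```

So the two only agree when x has no NAs and the weights already sum to 1.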

Here's an example:

> x <- rnorm(5e6)
> w <- runif(x)
> w <- w / sum(w)
> system.time(sum(x * w), gcFirst=T)
[1] 0.17 0.03 0.20   NA   NA
> system.time(s1 <- sum(x * w), gcFirst=T)
[1] 0.19 0.01 0.20   NA   NA
> system.time(s2 <- t(x) %*% w, gcFirst=T)
[1] 0.30 0.01 0.33   NA   NA
> system.time(s3 <- crossprod(x, w), gcFirst=T)
[1] 0.04 0.00 0.04   NA   NA
> c(s1, s2, s3)
[1] -0.0008922782 -0.0008922782 -0.0008922782

[This is without an optimized BLAS.  With one, the last two (t(x) %*% w
and crossprod(x, w)) might be significantly faster than what's seen here.]
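
If you want to repeat the comparison on your own machine, something along
these lines should do. It is only a sketch: the seed, the smaller vector
length and the variable names are my assumptions, and the timings will
depend on your hardware and on which BLAS your R is linked against
(sessionInfo() reports that in recent versions of R).

```r
set.seed(1)                      # assumed seed, only for reproducibility
x <- rnorm(1e6)                  # smaller than 5e6 so it runs quickly
w <- runif(length(x))
w <- w / sum(w)                  # normalize so the weights sum to 1

# Elapsed time (in seconds) of each of the three ways of computing sum(x * w).
t1 <- system.time(s1 <- sum(x * w))[["elapsed"]]
t2 <- system.time(s2 <- t(x) %*% w)[["elapsed"]]
t3 <- system.time(s3 <- crossprod(x, w))[["elapsed"]]
timings <- c(sum = t1, matprod = t2, crossprod = t3)

# s2 and s3 are 1 x 1 matrices; drop() makes them comparable with s1.
same <- isTRUE(all.equal(s1, drop(s2))) && isTRUE(all.equal(s1, drop(s3)))
```

Compare the entries of timings rather than relying on the numbers above.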

Andy


> From: Ranjan S. Muttiah
> 
> What is the difference between the above two operations ?