[R] Efficient computation of trimmed stats?

Tue May 15 14:30:31 CEST 2007

The following seems a bit faster; it replaces tapply() with split() and lapply():

set.seed(1)
nc <- 30
nr <- 25000
x <- matrix(rnorm(nc*nr), ncol = nc)
g <- matrix(sample(1:3, nr*nc, replace = TRUE), ncol = nc)

#################################

trimmedMeanByGroup1 <- function(y, grp, trim=.05)
   tapply(y, factor(grp, levels=1:3), mean, trim=trim)

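trimmedMeanByGroup2 <- function(y, grp, trim = .05){
   # note: split() only returns the groups present in grp, so this
   # assumes every group 1:3 occurs in each row
   unlist(lapply(split(y, grp), mean, trim = trim))
}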
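
To see what the two helpers return, here is a small illustration on a single made-up row (the vectors y and grp below are invented for the example and are not part of the simulated data). tapply() gives a 1-d array and unlist(lapply(split(...))) a named vector, so the comparison goes through as.vector():

```r
# Same two helpers as above, repeated so the snippet runs on its own.
trimmedMeanByGroup1 <- function(y, grp, trim = .05)
   tapply(y, factor(grp, levels = 1:3), mean, trim = trim)

trimmedMeanByGroup2 <- function(y, grp, trim = .05)
   unlist(lapply(split(y, grp), mean, trim = trim))

# Made-up toy row: 12 values, 4 per group.
y   <- c(1, 2, 3, 100,  5, 6, 7, 8,  -50, 10, 11, 12)
grp <- rep(1:3, each = 4)

# trim = 0.25 drops floor(4 * 0.25) = 1 value from each end of every group.
r1 <- trimmedMeanByGroup1(y, grp, trim = 0.25)
r2 <- trimmedMeanByGroup2(y, grp, trim = 0.25)

print(r1)
print(r2)
print(all.equal(as.vector(r1), as.vector(r2)))
```

With one value trimmed from each end, the outliers 100 and -50 no longer influence their group means.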

out1 <- out2 <- matrix(0, nr, 3)
system.time(for(i in 1:nr) out1[i, ] <- trimmedMeanByGroup1(x[i, ], g[i, ]))
system.time(for(i in 1:nr) out2[i, ] <- trimmedMeanByGroup2(x[i, ], g[i, ]))

all.equal(out1, out2)
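
The original question also asks for the sd on the trimmed vector, which mean(trim = ) does not provide. Below is a minimal sketch, assuming that "trimmed sd" means the ordinary sd() of the values left after dropping floor(n * trim) observations from each end (the same trimming rule mean() uses for trim < 0.5). The names trimmedSd and trimmedSdByGroup are hypothetical helpers, not functions from any package, and the toy data are made up:

```r
# Hypothetical helper: sd of the values that survive symmetric trimming.
# Assumes 0 <= trim < 0.5 and no missing values in y.
trimmedSd <- function(y, trim = .05) {
   n <- length(y)
   k <- floor(n * trim)                   # number dropped from each end
   if (n - 2 * k < 2) return(NA_real_)    # sd() needs at least two values
   sd(sort(y)[(k + 1):(n - k)])
}

# Hypothetical wrapper, built the same way as trimmedMeanByGroup2.
trimmedSdByGroup <- function(y, grp, trim = .05)
   unlist(lapply(split(y, grp), trimmedSd, trim = trim))

# Made-up toy row: 12 values, 4 per group.
y   <- c(1, 2, 3, 100,  5, 6, 7, 8,  -50, 10, 11, 12)
grp <- rep(1:3, each = 4)

res <- trimmedSdByGroup(y, grp, trim = 0.25)
print(res)
```

This can be dropped into the same for loop used for the trimmed means. I have not timed it on the full matrix, and the full sort() is probably not the fastest way to do the trimming.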

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message ----- 
From: "Benilton Carvalho" <bcarvalh at jhsph.edu>
To: "r-help at lists.r-project.org server posting" 
<r-help at stat.math.ethz.ch>
Sent: Monday, May 14, 2007 6:58 PM
Subject: [R] Efficient computation of trimmed stats?

> Hi everyone,
>
> I was wondering if there is anything already implemented for
> efficient ("row-wise") computation of group-specific trimmed stats
> (mean and sd on the trimmed vector) on large matrices.
>
> For example:
>
> set.seed(1)
> nc = 300
> nr = 250000
> x = matrix(rnorm(nc*nr), ncol=nc)
> g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)
>
> trimmedMeanByGroup <- function(y, grp, trim=.05)
>   tapply(y, factor(grp, levels=1:3), mean, trim=trim)
>
> sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))
>
> works fine... but:
>
> > system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g[i,])))
>    user  system elapsed
> 399.928   0.019 399.988
>
> is too slow to be practical for me.
>
> Maybe some package has some implementation of the above?
>
> Thank you very much,
> -b
>
> --
> Benilton Carvalho
> PhD Candidate
> Department of Biostatistics
> Bloomberg School of Public Health
> Johns Hopkins University
> bcarvalh at jhsph.edu
