[R] help to speed up loops in r

Bert Gunter gunter.berton at gene.com
Mon Jun 8 18:36:06 CEST 2009


AFAICS, the problem is that you have not (carefully?) read the docs and are
approaching R as a C programmer rather than taking a whole-object point of view.
The loop is completely unnecessary! PLEASE READ AN INTRO TO R
(again, perhaps) before posting.

So, unless I misunderstand (and my apologies if I do), what you want is
simply:

ix <- seq(from=2,to=40, by=2)
averagedResults <- (zz[,ix] + zz[,ix+1])/2

This was instantaneous on my computer for a 95000 x 40 matrix.  So if I got
it right, this is a pretty impressive example of why doing your homework by
first reading the docs is important.
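
For a self-contained check of the two lines above, here is a minimal sketch on
made-up data. It assumes zz is a numeric matrix whose column 1 is an identifier
and whose columns 2 to 41 hold the replicate pairs, as in the original script;
the toy dimensions, the random values and the re-attached ID column are
illustrative assumptions only.

```r
# Toy stand-in for zz: 1000 rows, 41 columns (assumed layout: column 1 = ID,
# columns 2..41 = 20 replicate pairs, as in the original loop).
set.seed(1)
zz <- cbind(seq_len(1000), matrix(rnorm(1000 * 40), ncol = 40))

# Vectorized averaging: the first column of each pair is 2, 4, ..., 40.
ix <- seq(from = 2, to = 40, by = 2)
averagedResults <- (zz[, ix] + zz[, ix + 1]) / 2

# Re-attach the ID column to get the 21-column layout of the original script.
averagedreplicates <- cbind(zz[, 1], averagedResults)

# Spot-check one cell against the formula used in the explicit loop.
stopifnot(all.equal(averagedreplicates[5, 3], (zz[5, 4] + zz[5, 5]) / 2))
dim(averagedreplicates)
```

If zz is a data frame rather than a matrix, converting the numeric columns with
as.matrix() first is one option; assigning cell by cell into a data frame
inside a loop is typically far slower than whole-column arithmetic.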

Bert Gunter
Genentech nonclinical Statistics







-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Jorge Ivan Velez
Sent: Monday, June 08, 2009 9:03 AM
To: Amit Patel
Cc: r-help at r-project.org
Subject: Re: [R] help to speed up loops in r

Dear Amit,
The following should get you started:

# Some data
set.seed(123)
X <- matrix(rnorm(20*10), ncol=10)
X

# Group of replicates
g <- rep(1:(ncol(X)/2), each=2)
g

# Mean of replicate variables
t(apply(X, 1, tapply, g, mean, na.rm = TRUE))

I created a grouping variable (g) and then calculated the mean by row
(apply(X, 1, ...)) for each level of g (that's why I included tapply).

I have not checked timing but I guess it is faster than the script you
already have.
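
One way to check the timing is system.time(). The sketch below uses made-up
data of a hypothetical size (5000 x 40) and compares the apply/tapply approach
with plain column arithmetic; the names t_apply, t_vec, m1, m2 and odd are
illustrative, and actual timings depend on the machine, so none are quoted here.

```r
set.seed(123)
X <- matrix(rnorm(5000 * 40), ncol = 40)
g <- rep(1:(ncol(X) / 2), each = 2)

# Row-wise tapply, as in the message above.
t_apply <- system.time(
  m1 <- t(apply(X, 1, tapply, g, mean, na.rm = TRUE))
)

# Column arithmetic on each pair of adjacent columns (no NA handling here).
odd <- seq(1, ncol(X), by = 2)
t_vec <- system.time(
  m2 <- (X[, odd] + X[, odd + 1]) / 2
)

# Both give the same numbers; only the dimnames differ.
stopifnot(isTRUE(all.equal(m1, m2, check.attributes = FALSE)))

# Elapsed seconds for each approach.
c(apply_tapply = t_apply[["elapsed"]], vectorized = t_vec[["elapsed"]])
```

Note that the column-arithmetic version returns NA for a pair if either
replicate is missing, whereas mean(..., na.rm = TRUE) averages whatever values
are present.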

HTH,

Jorge



On Mon, Jun 8, 2009 at 11:45 AM, Amit Patel <amitrhelp at yahoo.co.uk> wrote:

>
> Hi
> I am using a script which involves the following loop. It attempts to
> reduce a data frame (zz) of 95000 * 41 down to a data frame
> (averagedreplicates) of 95000 * 21 by averaging the replicate values, as you
> can see in the script below. This script, however, is very slow (2 days). Any
> suggestions to speed it up?
>
> NB I have also tried using rowMeans rather than adding the 2 values and
> dividing by 2. (same problem)
>
>
>
>
> #SCRIPT STARTS
> for (i in 1:length(averagedreplicates[,1]))
> #for (i in 1:dim(averagedreplicates)[1])
> {
> cat(i,'\n')
>
>
> #calculates Means
> #Sample A
> averagedreplicates[i,2] <- (zz[i,2] + zz[i,3])/2
> averagedreplicates[i,3] <- (zz[i,4] + zz[i,5])/2
> averagedreplicates[i,4] <- (zz[i,6] + zz[i,7])/2
> averagedreplicates[i,5] <- (zz[i,8] + zz[i,9])/2
> averagedreplicates[i,6] <- (zz[i,10] + zz[i,11])/2
>
> #Sample B
> averagedreplicates[i,7] <- (zz[i,12] + zz[i,13])/2
> averagedreplicates[i,8] <- (zz[i,14] + zz[i,15])/2
> averagedreplicates[i,9] <- (zz[i,16] + zz[i,17])/2
> averagedreplicates[i,10] <- (zz[i,18] + zz[i,19])/2
> averagedreplicates[i,11] <- (zz[i,20] + zz[i,21])/2
>
> #Sample C
> averagedreplicates[i,12] <- (zz[i,22] + zz[i,23])/2
> averagedreplicates[i,13] <- (zz[i,24] + zz[i,25])/2
> averagedreplicates[i,14] <- (zz[i,26] + zz[i,27])/2
> averagedreplicates[i,15] <- (zz[i,28] + zz[i,29])/2
> averagedreplicates[i,16] <- (zz[i,30] + zz[i,31])/2
>
> #Sample D
> averagedreplicates[i,17] <- (zz[i,32] + zz[i,33])/2
> averagedreplicates[i,18] <- (zz[i,34] + zz[i,35])/2
> averagedreplicates[i,19] <- (zz[i,36] + zz[i,37])/2
> averagedreplicates[i,20] <- (zz[i,38] + zz[i,39])/2
> averagedreplicates[i,21] <- (zz[i,40] + zz[i,41])/2
>  }
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
