[R] For->lapply->parallel apply

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Apr 11 02:15:36 CEST 2011


Hi,

On Sat, Apr 9, 2011 at 5:03 AM, Alaios <alaios at yahoo.com> wrote:
> Dear all,
> I would like to ask for your help in understanding the steps needed to make my program faster.
>
> The following code:
> Gauslist<-array(data=NA,dim=c(dimx,dimy,dimz))
> for (i in c(1:dimz)){
>    print(sprintf('Creating the %d map',i));
>    Gauslist[,,i]<-f <- GaussRF(x=x, y=y, model=model, grid=TRUE,param=c(mean,variance,nugget,scale,Whit.alpha))
> }
>
>
> creates 100 GaussMaps (each map is of 256*256 dim) and stores them in a matrix called Gauslist.
>
> This process takes too long, so I was wondering if you could help me understand what I should do to make it run in parallel (at work there is a system with 16 cores).
>
> There is mclapply (a parallel version of lapply). If I can make my code run with lapply, then I will be able to run it with mclapply as well (they have the same syntax).
> If I understand correctly, the sequence for doing that is the following:
>
> for..loop->lapply->mclapply
>
> Can you please help me understand if my for loop can be converted to lapply or not?

Your loop can be converted quite easily.

The lapply function simply takes an object to iterate over as its
first argument (this can be a list of things, a vector of things,
etc.) and a function to apply to each element in the iteration.
`lapply` will build a list of results that your function returns for
each element.

A simple example is to iterate over the words in a character vector
and return how many characters are in each word.

R> words <- c('cat', 'dogs', 'people')
R> sizes <- lapply(words, function(x) nchar(x))
R> sizes
[[1]]
[1] 3

[[2]]
[1] 4

[[3]]
[1] 6
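
To make the connection to a for loop explicit, here is a minimal sketch of the same computation written both ways. The names `sizes_loop` and `sizes_apply` are just made up for illustration:

```r
words <- c("cat", "dogs", "people")

# for-loop version: preallocate a list, then fill it one element at a time
sizes_loop <- vector("list", length(words))
for (i in seq_along(words)) {
  sizes_loop[[i]] <- nchar(words[i])
}

# lapply version: the loop body becomes the function that is applied
sizes_apply <- lapply(words, function(x) nchar(x))

# both approaches should give the same list
same <- identical(sizes_loop, sizes_apply)
print(same)
```

The only real change is that the body of the loop turns into a function of the element being iterated over, and lapply takes care of collecting the results.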

So in your example:

> for (i in c(1:dimz)){
>    print(sprintf('Creating the %d map',i));
>    Gauslist[,,i]<-f <- GaussRF(x=x, y=y, model=model,
>      grid=TRUE,param=c(mean,variance,nugget,scale,Whit.alpha))
> }

Could be something like:

gauslist <- lapply(1:dimz, function(i) {
  GaussRF(x=x, y=y, model=model, grid=TRUE,
          param=c(mean, variance, nugget, scale, Whit.alpha))
})
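
Note that lapply hands back a list of matrices, whereas your loop filled a 3-d array. If you need the array, one way to rebuild it is sketched below. I'm using a made-up stand-in function `make_map` (and tiny dimensions) in place of GaussRF so the snippet runs without the RandomFields package; it is only an assumption about the shape of what GaussRF returns with grid=TRUE:

```r
dimx <- 4; dimy <- 3; dimz <- 5

# Hypothetical stand-in for GaussRF(): returns one dimx-by-dimy matrix
make_map <- function() {
  matrix(rnorm(dimx * dimy), nrow = dimx, ncol = dimy)
}

# One matrix per iteration, collected in a list
map_list <- lapply(seq_len(dimz), function(i) make_map())

# Stack the list into a dimx x dimy x dimz array; unlist() concatenates the
# matrices in order, so slice [,,i] is the i-th element of the list
Gauslist <- array(unlist(map_list), dim = c(dimx, dimy, dimz))

print(dim(Gauslist))
```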

Using mclapply would be exactly the same, except that you replace lapply with mclapply.
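
A rough sketch of what that swap might look like follows. Some assumptions: `make_map` is again a made-up stand-in for the GaussRF call, the core count of 2 is arbitrary, and I load mclapply from the parallel package, which is where it lives in current R releases (at the time of writing it comes from the multicore package). mclapply relies on forking, which isn't available on Windows, so the sketch falls back to a single core there:

```r
library(parallel)

# Forking is not supported on Windows; mc.cores must be 1 there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# A parallel-safe RNG, so that the forked workers draw from separate streams
RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Hypothetical stand-in for GaussRF(): returns one 4-by-3 matrix
make_map <- function() matrix(rnorm(12), nrow = 4)

# Same call as with lapply(), plus the number of cores to use
map_list <- mclapply(seq_len(5), function(i) make_map(), mc.cores = n_cores)

print(length(map_list))
```

On your 16-core machine you would presumably raise mc.cores, keeping in mind that every worker holds its own copy of whatever it produces.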

Actually, is it correct that you aren't doing anything different in
the iterations of the for loop -- I mean, nothing in your code really
depends on your value for `i`, right?
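
If that's the case, replicate might express the intent more directly, since it just evaluates the same expression a given number of times. A minimal sketch, once more with a made-up `make_map` standing in for the GaussRF call and small dimensions so it runs quickly:

```r
dimx <- 4; dimy <- 3; dimz <- 5

# Hypothetical stand-in for GaussRF(): returns one dimx-by-dimy matrix
make_map <- function() matrix(rnorm(dimx * dimy), nrow = dimx, ncol = dimy)

# Evaluate the expression dimz times; simplify = "array" stacks the
# resulting matrices into a dimx x dimy x dimz array
Gauslist <- replicate(dimz, make_map(), simplify = "array")

print(dim(Gauslist))
```

This runs serially, so it is an alternative to the lapply step rather than to mclapply.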

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


