[R] how to parallelize 'apply' across multiple cores on a Mac
ccberry at ucsd.edu
Sat May 4 18:32:35 CEST 2013
David Romano <dromano <at> stanford.edu> writes:
> Hi everyone,
> I'm trying to use apply (with a call to zoo's rollapply within) on the
> columns of a 1.5Kx165K matrix, and I'd like to make use of the other cores
> on my machine to speed it up. (And hopefully also leave more memory free: I
> find that after I create a big object like this, I have to save my
> workspace and then close and reopen R to be able to recover memory tied up
> by R, but maybe that's a separate issue -- if so, please let me know!)
> It seems the package 'multicore' has a parallel version of 'lapply', which
> I suppose I could combine with a 'do.call' (I think) to gather the elements
> of the output list into a matrix, but I was wondering whether there might
> be another route.
[description of simple calc's deleted]
If you insist on explicitly parallelizing this:
The functions in the recommended package 'parallel' work on a Mac.
I would not try to work on each tiny column as a separate function call -
too much overhead if you parallelize - instead, bundle up 100-1000 columns
to operate on.
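A minimal sketch of the bundling idea, assuming the 'parallel' package's mclapply() (fork-based, so macOS/Linux only). The per-column calculation was elided from this thread, so process_block() is a hypothetical stand-in that just takes column means, and the matrix and block size are made up for illustration:

```r
library(parallel)  # ships with R; mclapply() forks, so use it on macOS/Linux

# Hypothetical stand-in for the real per-column calculation (elided in the
# thread): here it simply returns the mean of every column in a block.
process_block <- function(block) colMeans(block)

set.seed(1)
m <- matrix(rnorm(50 * 2000), nrow = 50)  # small stand-in for the 1.5K x 165K matrix

# Split the column indices into consecutive blocks of at most 500 columns.
block_size <- 500
blocks <- split(seq_len(ncol(m)), ceiling(seq_len(ncol(m)) / block_size))

# One function call per block, not per column, keeps the overhead low.
res <- mclapply(blocks,
                function(idx) process_block(m[, idx, drop = FALSE]),
                mc.cores = 2)

# Results come back in block order, so unlisting restores column order.
col_means <- unlist(res, use.names = FALSE)
```

The block size is a tuning knob: larger blocks mean fewer forks and less result-passing, at the cost of coarser load balancing.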
The calc's you describe sound simple enough that I would just write
them in C and use the .Call interface to invoke them. You only need enough
working memory in C to operate on one column and space to save the result.
So a MacBook with 8GB of memory will handle it with room to breathe.
This is a good use case for the 'inline' package, especially if you are
unfamiliar with the use of .Call.
But it might be as fast to forget about parallelizing this (explicitly).
If !any(is.na(column.values)), then what you are doing can be achieved by
desired.means[ , column.subset] <-
crossprod( suitable.matrix, matrix.values )
or better still
desired.means[, column.subset] <-
crossprod( minimal.matrix, matrix.values )[ fill.rows, ]
where suitable.matrix implements your steps 2-6.
minimal.matrix is unique(suitable.matrix,MARGIN=2)
fill.rows is such that minimal.matrix[, fill.rows] == suitable.matrix
matrix.values is a subset of columns from your original matrix
and column.subset is where the result should be placed in desired means.
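To make the crossprod idea concrete, here is a small sketch under stated assumptions. Steps 2-6 are not reproduced in this message, so the weights below are hypothetical: each output row is taken to be the mean of the 4-row group containing that row. The helper col_key() is likewise just an illustrative way to find fill.rows by matching columns.

```r
set.seed(2)
n <- 12
matrix.values <- matrix(rnorm(n * 5), nrow = n)  # 5 columns of the big matrix

# Hypothetical weights: output row i is the mean of the 4-row group
# that contains row i, so many columns of the weight matrix repeat.
grp <- rep(1:3, each = 4)
suitable.matrix <- outer(grp, grp, "==") / 4     # n x n

# Direct version: one matrix product does every column at once.
direct <- crossprod(suitable.matrix, matrix.values)

# Compact version: multiply by the distinct weight columns only,
# then expand the rows of the (much smaller) product.
minimal.matrix <- unique(suitable.matrix, MARGIN = 2)
col_key <- function(mat) apply(mat, 2, paste, collapse = " ")
fill.rows <- match(col_key(suitable.matrix), col_key(minimal.matrix))

compact <- crossprod(minimal.matrix, matrix.values)[fill.rows, , drop = FALSE]
```

Here the compact product is 3 x 5 instead of 12 x 5 before expansion; the saving grows with the number of duplicated weight columns.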
On a Mac, the vecLib BLAS will do crossprod using the multiple
cores without your needing to do anything special. So you can forget about
'parallel', 'multicore', etc.
So your remaining problem is to reread steps 2-6 and figure out what
'minimal.matrix' and 'fill.rows' have to be.
You can also approach this problem using 'filter', but that can get
'convoluted' (pun intended - see ?filter).
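For what it is worth, a rough sketch of the 'filter' route, assuming (hypothetically, since the actual steps are not shown here) that the goal is a centered rolling mean of width k down each column. stats::filter() accepts a matrix and filters each column separately:

```r
set.seed(3)
m <- matrix(rnorm(20 * 3), nrow = 20)  # small stand-in matrix
k <- 5                                 # hypothetical window width

# Equal weights of 1/k in a convolution filter give a rolling mean;
# sides = 2 centers the window, so the first and last (k - 1) / 2 rows are NA.
rolled <- stats::filter(m, rep(1 / k, k), sides = 2)

# Row 3 is the first complete window: it averages rows 1 to 5.
first_window <- as.numeric(rolled[3, 1])
```

Calling stats::filter() with the namespace prefix avoids picking up a different filter() if a package such as dplyr is attached.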