[R] multicore by(), like mclapply?

ivo welch ivo.welch at gmail.com
Mon Oct 10 20:54:26 CEST 2011


hi josh---thanks.  I had a different version of this and discarded it
because it was very slow.  the reason is that on each application,
your version has to rescan my (very long) data vector, and I have
many thousands of distinct cases.  I presume that by() makes a single
scan through the vector that produces all the splits.
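[a minimal sketch of that single-scan idea, assuming split() plus
mclapply() stand in for a parallel by(); the column names and the
mean() summary are made-up placeholders, not from the thread:]

```r
## split() partitions the data frame in one pass; mclapply() then works
## over the ready-made pieces, so no worker rescans the full vector.
## (grp, y, and the mean() summary are hypothetical placeholders.)
library(parallel)

df <- data.frame(grp = rep(c("a", "b", "c"), each = 3), y = 1:9)

pieces <- split(df, df$grp)                     # one scan builds all subsets
res <- mclapply(pieces, function(d) mean(d$y),  # parallel over the pieces
                mc.cores = 2)                   # use mc.cores = 1 on Windows
```

[unlist(res) then yields one summary per group.]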

regards,

/iaw
----
Ivo Welch (ivo.welch at gmail.com)




On Mon, Oct 10, 2011 at 11:07 AM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> Hi Ivo,
>
> My suggestion would be to pass lapply (or mclapply) only the indices.
> That should be fast, subsetting a keyed data.table should also be
> fast, and then you can run whatever computations you need.  For example:
>
> require(data.table)
> DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
> setkey(DT, x)  # keying on x makes DT[i] a fast binary-search subset
>
> lapply(unique(DT$x), function(i) DT[i])  # x is already character
>
> each DT[i] object is the subset of the data.table you want.  You can
> pass it to whatever function does your computations.
>
> Hope this helps,
>
> Josh
>
>
> On Mon, Oct 10, 2011 at 10:41 AM, ivo welch <ivo.welch at gmail.com> wrote:
>> dear R experts---is there a multicore equivalent of by(), just as
>> mclapply() is the multicore equivalent of lapply()?
>>
>> if not, is there a fast way to split a data.table, by a column, into
>> a list that lapply and mclapply can consume?
>>
>> advice appreciated...as always.
>>
>> regards,
>>
>> /iaw
>> ----
>> Ivo Welch (ivo.welch at gmail.com)
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, ATS Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>
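
[Josh's index-based pattern drops straight into mclapply(); a hedged
sketch on the same toy data, where the sum(v) summary is illustrative
only and mc.cores must be 1 on Windows:]

```r
## Parallel version of the index-based pattern: each worker receives
## only a short key, and the keyed data.table does a fast binary-search
## subset instead of rescanning the whole table.
## (sum(v) is a hypothetical summary for illustration.)
library(parallel)
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

keys <- unique(DT$x)
res <- mclapply(keys, function(k) DT[k, sum(v)],  # subset by key, then summarize
                mc.cores = 2)                     # use mc.cores = 1 on Windows
```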


