[BioC] multicore Vignette or HowTo??
edwin.groot at biologie.uni-freiburg.de
Tue Oct 19 14:49:30 CEST 2010
On Mon, 18 Oct 2010 09:39:22 -0700
Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 10/18/2010 09:05 AM, Edwin Groot wrote:
> > Hello all,
> > I have difficulty getting the multicore package doing what it
> > Does anybody have a benchmark that demonstrates something intensive
> > with and without multicore assistance?
> > I have a dual dual-core Xeon, and $ top tells me all R can squeeze out of
> > my Linux system is 25% us. Here is my example:
> >> pnorm <- mcparallel(normalize.Probes(array, method = "loess"))
> Here's my favorite test of parallel functionality
> > library(multicore)
> > system.time(lapply(1:4, function(i) Sys.sleep(1)))
>    user  system elapsed
>   0.001   0.000   4.004
> > system.time(mclapply(1:4, function(i) Sys.sleep(1)))
>    user  system elapsed
>   0.007   0.005   1.009
> time goes 4x faster!
Hmm, a great parlour trick!
> Code has to be multicore-aware, and saying something like
> pnorm <- mcparallel(normalize.Probes(array, method = "loess"))
> array_norm <- collect(pnorm)
> just says to fork a process to do the task, not to do the task in
> parallel (multicore doesn't do anything clever, like identify parts
Ahah, I am using this multicore package in ignorance. It shows how little
I know about what happens under the hood with the software. I asked
this clueless question in the first place because I need some real data
and code that demonstrate the principle of parallel computation.
What I gave as an example was trivial, as it is a single process,
If I get this right, I have to find a way to split my data into (up to
4 in my case) parts and have mcparallel() distribute their load?
Hmm, but that would not work for normalization, because all the
information from the data set is needed. Now what?
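
To make the "split into parts" idea concrete, here is a minimal sketch of
what I understand it to mean. Everything in it is an assumption for
illustration: the matrix x is made-up data, the per-column mean stands in
for a task whose pieces are independent, and library(parallel) stands in
for multicore (it offers the same mclapply() interface and, like multicore,
relies on forking, so this will not run in parallel on Windows). It does
not apply to loess normalization, where every chunk needs the whole data
set.

```r
library(parallel)  # assumption: used in place of multicore; same mclapply() call

# Hypothetical CPU-bound task: column means of a made-up numeric matrix.
set.seed(1)
x <- matrix(rnorm(4e5), ncol = 40)

# Split the column indices into 4 equal chunks, one per core.
n_cores <- 4
cols    <- seq_len(ncol(x))
chunks  <- split(cols, cut(cols, n_cores, labels = FALSE))

# Each forked worker handles one chunk; results come back in chunk order.
parts <- mclapply(chunks,
                  function(idx) colMeans(x[, idx, drop = FALSE]),
                  mc.cores = n_cores)

# Reassemble the pieces and check them against the serial answer.
par_means <- unlist(parts, use.names = FALSE)
stopifnot(isTRUE(all.equal(par_means, colMeans(x))))
```

The split only pays off when each chunk takes long enough to outweigh the
cost of forking and of sending the results back.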
> the code that could be parallelized). The Starr author would have to
> implement normalize.Probes to take advantage of multiple cores, or
> own task would have to be parallelizable at the 'user' level, like an
> I'm really not sure why array_norm is NULL. after looking at the
> on ?normalize.Probes I did
I think I entered array_norm <- collect(pnorm) twice, which probably
throws out the contents from the first collect() call.
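
For reference, a minimal sketch of the fork-and-collect pattern with the
result collected exactly once and kept. This is an assumption-laden
illustration, not a reproduction of my session: the summed vector is a
made-up stand-in for the real task, and library(parallel) is used in place
of multicore, where mcparallel() is the same and mccollect() plays the role
of collect(). It does not test my guess about the second collect() call.

```r
library(parallel)  # assumption: mccollect() here corresponds to multicore's collect()

# Fork one child process to evaluate a single expression.
job <- mcparallel(sum(as.numeric(1:1000)))

# Collect once and keep the result; it is a list with one element per job.
res   <- mccollect(job)
total <- res[[1]]
stopifnot(total == 500500)
```

Note that this still runs the expression in a single child process, so it
frees up the R prompt but does not make the task itself any faster.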
> >> Normalizing probes with method: loess
> > Done with 1 vs 2 in iteration 1
> > #Function continues for some time and displays more messages. No
> > benefit from multicore. $ top reports 25% us during the run...
> >> array_norm <- collect(pnorm)
> > #Oh dear, where did my normalized data go?
> >> array_norm
> > $`4037`
> > NULL
> >> sessionInfo()
> > R version 2.11.1 (2010-05-31)
> > x86_64-pc-linux-gnu
> > locale:
> >  LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> >  LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> >  LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8
> >  LC_PAPER=en_GB.UTF-8 LC_NAME=C
> >  LC_ADDRESS=C LC_TELEPHONE=C
> >  LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
> > attached base packages:
> >  tools grid stats graphics grDevices utils
> > datasets
> >  methods base
> > other attached packages:
> >  geneplotter_1.26.0 annotate_1.26.1 AnnotationDbi_1.10.2
> >  Starr_1.4.4 affxparser_1.20.0 affy_1.26.1
> >  Ringo_1.12.0 Matrix_0.999375-39 lattice_0.18-8
> >  limma_3.4.4 RColorBrewer_1.0-2 Biobase_2.8.0
> >  multicore_0.1-3
> > loaded via a namespace (and not attached):
> >  affyio_1.16.0 DBI_0.2-5 genefilter_1.30.0
> >  MASS_7.3-6 preprocessCore_1.10.0 pspline_1.0-14
> >  RSQLite_0.9-2 splines_2.11.1 survival_2.35-8
> >  tcltk_2.11.1 xtable_1.5-6
> > RTFMing only gives me the syntax of some functions in the multicore
> > package. How do I successfully apply this thing to my code?
> > Regards,
> > Edwin
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> Location: M1-B861
> Telephone: 206 667-2793
Dr. Edwin Groot, postdoctoral associate
Institut fuer Biologie III
79104 Freiburg, Deutschland