[BioC] bioconductor on EMR / mapreduce

Dan Tenenbaum dtenenba at fhcrc.org
Mon Sep 24 19:50:55 CEST 2012

On Mon, Sep 24, 2012 at 9:42 AM, seth redmond <seth.redmond at pasteur.fr> wrote:
> I'm trying to install some bioC modules on EC2 / Elastic Mapreduce but I'm running into some library errors when installing (error below). Whilst I could install them locally on each machine, if possible I'd rather avoid the overhead both in terms of bootstrapping the machines, and having to check for library errors whenever I write a new method.
> Does anyone have experience of running BioC in the cloud in this manner? Has anyone tried, for instance, building a library in an S3 bucket and running directly from there, or porting the R library wholesale when starting up the nodes? Or is it possible to use the BioC AWS image in EMR somehow?

From what I have been able to tell, AWS EMR is not very usable with R.
It takes longer to load packages on each mapper/reducer than it does
to run the calculation I am trying to parallelize.
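
To put rough numbers on that overhead, here is a minimal Python sketch.
The function names and all figures are hypothetical, not measurements
from EMR. It assumes each map/reduce task starts a fresh R process and
reloads its packages, and compares that with long-lived workers that
load packages once. The totals are summed process-seconds, not
wall-clock time.

```python
def per_task_total(n_tasks, load_s, compute_s):
    """Total seconds when every task reloads packages before computing."""
    return n_tasks * (load_s + compute_s)


def persistent_total(n_tasks, n_workers, load_s, compute_s):
    """Total seconds when each long-lived worker loads packages once."""
    return n_workers * load_s + n_tasks * compute_s


# Hypothetical figures: 1000 tasks, 20 s to load packages, 5 s of real work.
fresh = per_task_total(1000, 20.0, 5.0)
reused = persistent_total(1000, 10, 20.0, 5.0)
print(fresh, reused)  # 25000.0 5200.0
```

When the load time exceeds the compute time, as in the case described
above, the per-task total is dominated by package loading, which is why
approaches that keep a worker alive across many tasks look attractive.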

I've looked at other strategies like RHIPE, or good old MPI.

> thanks
> -s
>> * Installing *source* package 'DNAcopy' ...
>> ** libs
>> gfortran   -fpic  -g -O2 -c changepoints.f -o changepoints.o
>> gcc -std=gnu99 -I/usr/share/R/include      -fpic  -g -O2 -c flchoose.c -o flchoose.o
>> gcc -std=gnu99 -I/usr/share/R/include      -fpic  -g -O2 -c fphyper.c -o fphyper.o
>> gcc -std=gnu99 -I/usr/share/R/include      -fpic  -g -O2 -c fpnorm.c -o fpnorm.o
>> gfortran   -fpic  -g -O2 -c getbdry.f -o getbdry.o
>> gfortran   -fpic  -g -O2 -c hybcpt.f -o hybcpt.o
>> gfortran   -fpic  -g -O2 -c prune.f -o prune.o
>> gcc -std=gnu99 -I/usr/share/R/include      -fpic  -g -O2 -c rshared.c -o rshared.o
>> gfortran   -fpic  -g -O2 -c segmentp.f -o segmentp.o
>> gcc -std=gnu99 -shared  -o DNAcopy.so changepoints.o flchoose.o fphyper.o fpnorm.o getbdry.o hybcpt.o prune.o rshared.o segmentp.o  -lgfortran -lm -L/usr/lib64/R/lib -lR
>> /usr/bin/ld: cannot find -lgfortran
>> collect2: ld returned 1 exit status
>> make: *** [DNAcopy.so] Error 1
>> ERROR: compilation failed for package 'DNAcopy'
>> ** Removing '/home/hadoop/R/x86_64-pc-linux-gnu-library/2.7/DNAcopy'
>> The downloaded packages are in
>>         /tmp/RtmpxSeilp/downloaded_packages
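
In the log above the gfortran compile steps succeed and the failure
comes at the link step, where ld cannot resolve -lgfortran. When
bootstrapping many nodes it can help to pull such failures out of the
install output automatically. A minimal Python sketch follows; the
function name is hypothetical, and it assumes GNU ld's
"cannot find -l<name>" wording as seen in this log.

```python
import re

# GNU ld reports an unresolved -l option as "cannot find -l<name>".
# Some ld versions append ": No such file or directory", so stop at ':'.
MISSING_LIB = re.compile(r"cannot find -l([^\s:]+)")


def missing_libraries(build_log):
    """Return library names the linker could not find, in order, no repeats."""
    seen = []
    for name in MISSING_LIB.findall(build_log):
        if name not in seen:
            seen.append(name)
    return seen


# Lines taken from the install output quoted above.
log = """\
/usr/bin/ld: cannot find -lgfortran
collect2: ld returned 1 exit status
make: *** [DNAcopy.so] Error 1
ERROR: compilation failed for package 'DNAcopy'
"""
print(missing_libraries(log))  # ['gfortran']
```

A result of ['gfortran'] points at the linker's library search path on
the node rather than at the package itself. One thing worth checking,
as an assumption and not a confirmed diagnosis, is whether an
unversioned libgfortran.so is present where gcc looks for libraries.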
> --
> Seth Redmond
>   Unité Génetique et Génomique des Insectes Vecteurs
>   Institut Pasteur
>   28,rue du Dr Roux
>   75724 PARIS
> seth.redmond at pasteur.fr
