[BioC] Running DESeq with 1000 samples

Steve Lianoglou lianoglou.steve at gene.com
Wed Jul 9 21:27:51 CEST 2014


Hi,

On Wed, Jul 9, 2014 at 11:58 AM, Maoqi Xu [guest]
<guest at bioconductor.org> wrote:
> Hi,
> I'm using DESeq to find the differentially expressed genes between 2 populations. The RNA-seq data set has a total sample size of around 1000. However, even when I set the memory limit of R to 6 GB, it still reports the error that it cannot allocate a vector of a certain size. I wonder if it's possible to use DESeq on this huge data set, and how much memory would be enough.

First: if you're just starting your project, you should prefer DESeq2,
the successor to DESeq.

Second: you'll need some serious horsepower. Someone will likely
swoop in with a precise calculation, but I wouldn't expect this to
work on a machine with 8 GB of RAM. Maybe 16 GB would be enough, but if
you're routinely working on data at this scale, I hope you've got a
big-iron machine with ~64 GB or more of RAM.
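
For a rough sense of scale (a back-of-envelope sketch, not the precise
calculation), the size of one dense copy of the count matrix can be
estimated as below. The gene count of 30,000 is an assumption, not a
figure from the original question, and the function names are made up
for illustration. How many such copies are alive at once during
fitting depends on the package internals, so read the result as a
per-copy figure, not a total.

```python
# Rough per-copy memory estimate for a dense genes x samples matrix.
# ASSUMPTIONS: 30,000 genes (hypothetical; the post only gives ~1000
# samples) and 8 bytes per value (double precision).

def matrix_bytes(n_genes, n_samples, bytes_per_value=8):
    """Bytes needed to hold one dense n_genes x n_samples matrix."""
    return n_genes * n_samples * bytes_per_value


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


N_SAMPLES = 1000   # from the question
N_GENES = 30000    # assumed, adjust to your annotation

one_copy = matrix_bytes(N_GENES, N_SAMPLES)
print(f"one dense copy: {to_gib(one_copy):.2f} GiB")

# Show how the footprint grows with the number of simultaneous copies.
# The copy counts here are illustrative, not measured from DESeq.
for n_copies in (1, 5, 10, 20):
    print(f"{n_copies:>2} copies: {to_gib(one_copy * n_copies):.2f} GiB")
```

A single copy is small on its own; the trouble comes from the
intermediate matrices of the same shape that get allocated along the
way, which is why a 6 GB limit can still fall over.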

One option would be to do the "hard bits" on Amazon's cloud using
Bioconductor's latest and greatest AMI:

http://www.bioconductor.org/help/bioconductor-cloud-ami/

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech


