[BioC] Problem running summarizeOverlaps()

Jessica Perry Hekman hekman2 at illinois.edu
Tue May 20 21:20:43 CEST 2014


On 05/20/2014 11:05 AM, Martin Morgan wrote:

>> Error: C stack usage is too close to the limit
>>

> I haven't seen this error before in the context of summarizeOverlaps, so
> it's a bit puzzling. I'd first check that the
>
>    fls <- list.files("../../bam/", pattern="fox-readgroups.bam$",
>                      full.names=TRUE)
>
> all point to valid bam files, and the bam files have indexes.

* Yes, that directory is full of BAM files with their BAI indexes -- 
when I first ran the R script, I didn't have BAI files and it complained 
about that. I indexed the BAMs and then it was happy. Sample BAM 
filename: ../../bam/52_172-fox-readgroups.bam
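
The index check described above can be scripted rather than done by eye. The following is a minimal sketch in base R; missing_bam_index is a hypothetical helper name, not part of Rsamtools, and it assumes each index sits next to its BAM file as either <name>.bam.bai or <name>.bai.

```r
# Hypothetical helper (not part of any package): return the BAM files
# that have no index next to them. Assumes the index is named either
# <name>.bam.bai or <name>.bai.
missing_bam_index <- function(bam_files) {
  has_index <- file.exists(paste0(bam_files, ".bai")) |
    file.exists(sub("\\.bam$", ".bai", bam_files))
  bam_files[!has_index]
}

# Same directory and file-name pattern as in the thread.
fls <- list.files("../../bam/", pattern = "fox-readgroups\\.bam$",
                  full.names = TRUE)

# An empty result means every listed BAM file has an index.
missing_bam_index(fls)
```

This only confirms that an index file exists; it does not verify that the BAM file or its index is valid.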

> You might then try adding a 'yieldSize' argument to the following line,
> starting small (e.g., 100000) and moving toward the default (1000000) if
> the small size works when calling summarizeOverlaps, or perhaps smaller
> if it fails.
>
>    bamfls <- BamFileList(fls, yieldSize=100000)
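
For context on the suggestion above: in Rsamtools, yieldSize is the number of records returned each time an open BAM file is read, where a record is one alignment entry in the file (roughly, one aligned read), not one file. The sketch below illustrates that chunking on the small example BAM file shipped with Rsamtools. It assumes Rsamtools is installed, and the chunk size of 1000 is arbitrary, chosen only for illustration.

```r
library(Rsamtools)

# Example BAM file shipped with Rsamtools (stands in for a real data file).
example_bam <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# yieldSize = 1000 is an arbitrary illustrative value: each read of the
# open file returns at most 1000 alignment records.
bf <- BamFile(example_bam, yieldSize = 1000)

open(bf)
n_chunks <- 0
n_records <- 0
repeat {
  # Read the next chunk, keeping only the read names to stay lightweight.
  qnames <- scanBam(bf, param = ScanBamParam(what = "qname"))[[1]]$qname
  if (length(qnames) == 0L) break   # no records left in the file
  n_chunks <- n_chunks + 1
  n_records <- n_records + length(qnames)
}
close(bf)

cat("records:", n_records, "read in", n_chunks, "chunks\n")
```

A smaller yieldSize means less data held in memory at once, at the cost of more passes through the loop.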

Interesting -- that seems to have worked. The process is running much 
longer now, anyway, and I don't want to wait for it to finish because I 
suspect it might take hours. I'll try it again with a larger number once 
I'm sure it's exited successfully.

yieldSize seems to have to do with the number of records it will 
process. It's only 24 BAM files so I'm not sure whether "records" means 
"files". Does "records" mean "reads"?

> Can you provide a little information about your system? It sounds like
> it's your own machine, not a server. How much memory?

It's the lab server, and I have sudo access. 120G memory (about 80G free 
at the moment as other jobs are running). Fedora 20.

> Probably you'd get a different outcome with a more recent R /
> Bioconductor, but I'm not sure whether the error would go away! I have a
> sense that the problem with package manager installation is that they or
> you end up installing non-default packages into a single system
> directory, and as a consequence the directory contains a mix of
> different Bioconductor releases. A 'better practice' is probably to
>
>    a) remove any existing system-wide R installation and packages
>
>    b) install R with only base packages as su, or (as I do) install R as
> a regular user (not su) in version-specific directories in your own user
> file system, e.g., ~mtmorgan/bin/R-3-1-branch/
>
>    c) install any additional packages, via biocLite or otherwise, as a
> regular user, following R's prompt to create a version-specific
> directory in your own user hierarchy.
>
> Obviously this can be a rats nest of problems, and should only be done
> immediately before a big deadline or when you are feeling too productive
> and need to scale back ;)

Noted! I wonder about just installing the updated version locally so 
everyone else on the server (2 other people) can keep using the default 
version. I might try that. Leaving two versions scares me, of course.
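
On the mixed-library concern quoted above, one rough way to see what is installed where is to list the packages that were built under a different R x.y series than the one currently running. This is a sketch in base R only; built_mismatch is a hypothetical helper name. Treating the "Built" field as a proxy for a mixed Bioconductor installation is an assumption, resting on Bioconductor releases being tied to particular R versions.

```r
# Hypothetical helper: list installed packages whose "Built" R version
# (major.minor) differs from the R that is currently running.
built_mismatch <- function(pkgs = installed.packages()) {
  current <- paste(R.version$major,
                   sub("\\..*$", "", R.version$minor),
                   sep = ".")
  built <- sub("^([0-9]+\\.[0-9]+).*$", "\\1", pkgs[, "Built"])
  pkgs[built != current, c("Package", "LibPath", "Built"), drop = FALSE]
}

# Library directories searched by this R session, in order.
.libPaths()

# Packages (and the library they live in) built under another R series.
built_mismatch()
```

The LibPath column shows whether a mismatched package sits in the system-wide library or in a per-user one.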

Jessica


