[BioC] Problem running summarizeOverlaps()

Martin Morgan mtmorgan at fhcrc.org
Thu May 22 17:10:23 CEST 2014


On 05/21/2014 07:39 PM, Jessica Perry Hekman wrote:
> On 05/20/2014 04:25 PM, Martin Morgan wrote:
>
>> If you do decide to update your R, summarizeOverlaps has moved to
>> GenomicAlignments.
>
> So I figured if I was going to be having this much trouble I might as well be on
> the latest R in case that helped. So I upgraded to 3.1.0 and the latest
> Bioconductor. I modified my script a bit as below. And now I have a new error
> message, at that same line that we had just gotten working in the old version
> (augh).
>
>
> source("http://bioconductor.org/biocLite.R")
> biocLite(c("GenomicFeatures", "GenomicAlignments", "pathview", "gage",
> "gageData", "Rsamtools", "leeBamViews"))
>
> # Build data object: Dog
> txdb.canFam3 <- GenomicFeatures::makeTranscriptDbFromUCSC("canFam3","refGene")
> exByGn <- GenomicFeatures::exonsBy(txdb.canFam3, "gene")
>
> # Read counts
> library(Rsamtools)
> fls <- list.files("../../bam/", pattern="-fox-readgroups.bam$", full.names=T)
>
> library(leeBamViews)
> bamfls <- BamFileList(fls, yieldSize=100000)
>
> flag <- scanBamFlag(isNotPrimaryRead=FALSE, isProperPair=TRUE)
> param <- ScanBamParam(flag=flag)
>
> # Count only reads which match exactly once ("Union")
> options(mc.cores=8)
> gnCnt <- GenomicAlignments::summarizeOverlaps(exByGn, bamfls, mode="Union",
>           ignore.strand=TRUE, single.end=TRUE, param=param)
>
> OUTPUT:
>
> Error in array(x, c(length(x), 1L), if (!is.null(names(x)))
> list(names(x),  :
>    'data' must be of a vector type, was 'NULL'

Hi Jessica -- sorry that this is again causing problems. Can you provide the 
output of

   traceback()

after the error occurs? This will help isolate where in the code things are 
going wrong.

Also, on your own end, if you're feeling adventurous, you could try

   options(error=recover)

and then, when the error occurs, you'll get a 'backtrace' of the function calls 
in effect at that point; you can select the call that seems most 
helpful (this can be a bit of an art, and requires some overall knowledge of the 
code). This brings you into the browser (see ?browser), where you can look 
around to figure out in more detail what the data that causes the error looks 
like. Here's an artificial example:

     f <- function(x) log(x)
     g <- function(x) f(as.character(x))
     h <- function(x) g(x)

and an error

     > h(1)
     Error in log(x) : non-numeric argument to mathematical function

We can see how we got from where we invoked the call h(1) to where the error 
occurred with

     > traceback()
     3: f(as.character(x))
     2: g(x)
     1: h(1)

We could then look at f, say

     > f
     function (x)
     log(x)

and still not really understand what the problem is, so we try to look at the 
value of 'x' when we get to f

     > options(error=recover)
     > h(1)
     Error in log(x) : non-numeric argument to mathematical function

     Enter a frame number, or 0 to exit

     1: h(1)
     2: g(x)
     3: f(as.character(x))

     Selection: 3
     Called from: g(x)
     Browse[1]> ls()
     [1] "x"
     Browse[1]> x
     [1] "1"
     Browse[1]>

ah, the variable 'x' is somehow a character vector, and that's causing problems 
as we can confirm

     Browse[1]> log(x)
     Error during wrapup: non-numeric argument to mathematical function
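
Printing 'x' was enough here because the quotes give it away, but with less 
obvious objects it can help to ask R directly what it is holding. A small 
sketch of the kind of thing I'd type at the Browse[1]> prompt; here 'x' is set 
by hand (an assumption, just so the lines run outside the browser too):

```r
# stand-in for the value found in frame 3 of the example above
x <- "1"

class(x)        # the type R thinks it has
is.numeric(x)   # log() needs this to be TRUE
str(x)          # compact summary, handy for larger objects

# converting back to a number makes the call succeed
log(as.numeric(x))
```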

We can get back out of the browser and the recover function, and reset options with

     Browse[1]> Q
     > options(error=NULL)
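
If you had already set an error handler of your own, options(error=NULL) would 
throw it away. A minimal sketch of one way around that, relying on the fact 
that options() returns the previous values; the name 'old' is just for 
illustration:

```r
# options() invisibly returns the old values of the options it changes
old <- options(error = recover)   # 'old' is an illustrative name

# ... reproduce the error and look around in the browser here ...

# restore whatever error handler was in place before (NULL by default)
options(old)
```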

As in this example, the problem is often upstream of where the error 
occurred, and I guess that's where the art comes in -- understanding the code 
well enough to see where the problem is, and tracing the issue to its origin.

Anyway, if you provide the output of traceback(), that'll give me a better 
start on understanding your problem.
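
If the script is run non-interactively (e.g. with Rscript) there is no prompt 
left at which to type traceback(). One rough way to record the calls anyway is 
sketched below, using the toy functions from above; 'capture_calls' is a 
hypothetical helper written for this email, not something from a package, and 
the recorded stack will also include the helper's own internal frames:

```r
# the toy functions from the example above
f <- function(x) log(x)
g <- function(x) f(as.character(x))
h <- function(x) g(x)

# hypothetical helper: evaluate 'expr'; if it fails, return the calls that
# were active when the error was signalled instead of stopping the script
capture_calls <- function(expr) {
    calls <- NULL
    tryCatch(
        withCallingHandlers(expr, error = function(e) {
            # a calling handler runs before the stack is unwound, so
            # sys.calls() still contains h(1), g(x) and f(as.character(x))
            calls <<- as.list(sys.calls())
        }),
        error = function(e) NULL   # then swallow the error quietly
    )
    # one string per call, suitable for pasting into an email
    vapply(calls, function(cl) paste(deparse(cl), collapse = " "),
           character(1))
}

stack <- capture_calls(h(1))
print(stack)
```

In your script the h(1) would be replaced by the summarizeOverlaps() call that 
fails.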

Martin

>
>
> And, because all my packages have changed now:
>
>  > sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] leeBamViews_1.0.0    BSgenome_1.32.0      Biobase_2.24.0
>   [4] Rsamtools_1.16.0     Biostrings_2.32.0    XVector_0.4.0
>   [7] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2   IRanges_1.22.6
> [10] BiocGenerics_0.10.0  BiocInstaller_1.14.2
>
> loaded via a namespace (and not attached):
>   [1] AnnotationDbi_1.26.0    BatchJobs_1.2           BBmisc_1.6
>   [4] BiocParallel_0.6.0      biomaRt_2.20.0          bitops_1.0-6
>   [7] brew_1.0-6              codetools_0.2-8         DBI_0.2-7
> [10] digest_0.6.4            fail_1.2                foreach_1.4.2
> [13] GenomicAlignments_1.0.1 GenomicFeatures_1.16.0  iterators_1.0.7
> [16] plyr_1.8.1              Rcpp_0.11.1             RCurl_1.95-4.1
> [19] RSQLite_0.11.4          rtracklayer_1.24.1      sendmailR_1.1-2
> [22] stats4_3.1.0            stringr_0.6.2           tools_3.1.0
> [25] XML_3.98-1.1            zlibbioc_1.10.0
>  >
>
> I feel badly continuing to ask for help, but these error messages are beyond me.
> In this case, my best guess was that the message is saying that I got the
> signature of the method wrong -- but when I tried changing my arguments, I got a
> message saying "no method matches that signature," so I conclude that that's not
> the problem.
>
> Thanks,
> Jessica
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


