[BioC] readGAlignmentPairs error

Hervé Pagès hpages at fhcrc.org
Sat Sep 28 01:55:05 CEST 2013


Hi Leonard,

Good timing. I was actually going to get back to you today about this
because yesterday I fixed a bug in findMateAlignment() that I think was
causing the problem you originally reported on the mailing list. It's
also probably what is causing this new problem you are reporting below.
Totally agree with you that even if using a 'which' argument can alter
the pairing, in no case should it produce an error. More generally, the
intended behavior of readGAlignmentPairs() is to *drop* records that
don't have a mate and to *dump* ambiguous pairings, not to choke on
them. The difference between "drop" and "dump" here is that what's
dumped can be examined later with getDumpedAlignments(). See
?getDumpedAlignments for the details.
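
For illustration, here is a minimal sketch of examining dumped records after a
read. It makes two assumptions that are not part of this thread: that a recent
Bioconductor is installed, where these functions are provided by the
GenomicAlignments package (with the package versions discussed here, loading
Rsamtools would be the equivalent), and that the example file ex1.bam shipped
with Rsamtools stands in for your own BAM file.

```r
# Assumption: recent Bioconductor, where readGAlignmentPairs() and the
# *DumpedAlignments() helpers live in GenomicAlignments.
library(GenomicAlignments)

# Assumption: ex1.bam from Rsamtools is used as a stand-in for your own file.
bam <- system.file("extdata", "ex1.bam", package = "Rsamtools", mustWork = TRUE)

# Start from an empty dump environment so the count below refers to this read only.
flushDumpedAlignments()

# Records without a mate are dropped; records with ambiguous pairing are dumped.
galp <- readGAlignmentPairs(bam)

# Number of records dumped during the read (may well be 0 for this file).
n_dumped <- countDumpedAlignments()
n_dumped

# The dumped records themselves, for inspection.
dumped <- getDumpedAlignments()
dumped
```

Flushing first is a design choice rather than a requirement: the dump
environment persists across calls, so without it the retrieved records may
include leftovers from an earlier read.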

Please let me know if you still run into those issues with the latest
Rsamtools (1.13.45). Works fine for me on your test.bam file (thanks
for providing the file).
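
A quick way to check whether the installed Rsamtools is at least the version
mentioned above (a sketch only; the messages are illustrative and not produced
by any package):

```r
# Compare the installed Rsamtools version with the one named in this message.
installed <- packageVersion("Rsamtools")
has_fix <- installed >= "1.13.45"

if (has_fix) {
    message("Rsamtools ", as.character(installed), " is at least 1.13.45.")
} else {
    message("Rsamtools ", as.character(installed), " is older than 1.13.45; please update.")
}
```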

Cheers,
H.


On 09/27/2013 04:23 PM, Leonard Goldstein wrote:
> Hi Hervé
>
> Thanks for your and Valerie's comments on using 'which' with
> readGAlignmentPairs.
>
> I actually encountered the same error again without using 'which' (see
> below). I attached test.bam, which narrows down the problematic
> alignment(s) to around ~600. If readGAlignmentPairs could drop any
> unusual alignments rather than generating an error that would be really
> helpful.
>
> Many thanks for your help.
>
> Leonard
>
>
>  > readGAlignmentPairs(file = "test.bam")
> Error in makeGAlignmentPairs(galn, use.names = use.names, use.mcols =
> use.mcols) :
>    findMateAlignment() returned an invalid 'mate' vector
> In addition: Warning message:
> In findMateAlignment(x) :
>      2 alignments with ambiguous pairing were dumped.
>      Use 'getDumpedAlignments()' to retrieve them from the dump environment.
>  >
>  > sessionInfo()
> R version 3.0.0 (2013-04-03)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] Rsamtools_1.13.44     Biostrings_2.29.19    GenomicRanges_1.13.45
> [4] XVector_0.1.4         IRanges_1.19.38       BiocGenerics_0.7.5
>
> loaded via a namespace (and not attached):
> [1] bitops_1.0-6   stats4_3.0.0   zlibbioc_1.7.0
>  >
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


