[BioC] makeGappedAlignmentPairs for broken pairs

Valerie Obenchain vobencha at fhcrc.org
Thu Jan 16 21:09:37 CET 2014


Hi Ido,

library(GenomicAlignments)
fl <- system.file("extdata", "ex1.bam", package="Rsamtools")

You can set different flags in ScanBamParam to control what is read in.
See ?ScanBamParam for details. Here is a 'param' that reads only records 
with unmapped mates.

param <- ScanBamParam(flag=scanBamFlag(hasUnmappedMate=TRUE))

Two classes can hold paired-end reads, one is GAlignmentPairs and the 
other is GAlignmentsList. They run the same mate-pairing code but return 
different objects.

## GAlignmentPairs:

Oops. Looks like there is a bug in readGAlignmentPairs() when this flag 
is set. We will fix that.
 > gapairs <- readGAlignmentPairs(fl, param=param)
Error in .combineBamFlagFilters(bamFlag(param, asInteger = TRUE), flag0) :
   BAM flag filters to combine are incompatible

## GAlignmentsList:

galist <- readGAlignmentsList(fl, param=param)
>> galist
> GAlignmentsList of length 127:
> [[1]]
> GAlignments with 1 alignment and 0 metadata columns:
>       seqnames strand cigar qwidth start  end width ngap
>   [1]     seq2      +   35M     35  1520 1554    35    0
>
> [[2]]
> GAlignments with 1 alignment and 0 metadata columns:
>       seqnames strand cigar qwidth start end width ngap
>   [1]     seq1      +   36M     36     6  41    36    0

There is a metadata column on the list that has the mate status. In this 
case they are all 'unmated'.
 > mcols(galist)
DataFrame with 127 rows and 1 column
     mate_status
        <factor>
1       unmated
2       unmated
3       unmated
4       unmated
5       unmated
...         ...

This code is available in the devel branch.

Valerie


 > sessionInfo()
R Under development (unstable) (2014-01-06 r64682)
Platform: x86_64-unknown-linux-gnu (64-bit)
...

other attached packages:
[1] GenomicAlignments_0.99.11 VariantAnnotation_1.9.28
[3] Rsamtools_1.15.18         Biostrings_2.31.7
[5] GenomicRanges_1.15.19     XVector_0.3.6
[7] IRanges_1.21.19           BiocGenerics_0.9.2

loaded via a namespace (and not attached):
  [1] AnnotationDbi_1.25.9   Biobase_2.23.3         biomaRt_2.19.1
  [4] bitops_1.0-6           BSgenome_1.31.7        DBI_0.2-7
  [7] GenomicFeatures_1.15.4 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.23.6     stats4_3.1.0           tools_3.1.0
[13] XML_3.98-1.1           zlibbioc_1.9.0




On 01/16/2014 02:57 AM, Ido Tamir wrote:
> Hi,
> is there a roadmap for allowing paired end reads where only one is mapped to be reported?
> Basically I would like to have readGappedAlignmentPairs report them also.
> One could reread the whole file and take only those reads where the  makeGappedAlignmentPairs test
> was unsuccessfull.
>
> thank you very much,
> ido
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


-- 
Valerie Obenchain

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B155
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: vobencha at fhcrc.org
Phone:  (206) 667-3158
Fax:    (206) 667-1319



More information about the Bioconductor mailing list