[BioC] Rsamtools: readGAlignmentsFromBam weird behavior reading MAPQ and RNEXT fields

rubi [guest] guest at bioconductor.org
Tue Oct 15 01:41:14 CEST 2013


Hi,

I'm using readGAlignmentsFromBam where I defined the bam parameters to be scanned as follows:
class: ScanBamParam
bamFlag (NA unless specified):
bamSimpleCigar: FALSE
bamReverseComplement: FALSE
bamTag: NH, HI, AS, nM 
bamWhich: 0 elements
bamWhat: flag, mapq, mrnm, mpos, seq, qual

Using an example of a single alignment record I get the following:
ILLUMINA:223:D0CUVACXX:5:1104:11885:187863	0	chr11	96806772	255	44M	*	0	0	GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC	JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ	NH:i:1	HI:i:1	AS:i:43	nM:i:0

                                             seqnames strand       cigar    qwidth     start       end     width      ngap |      flag      mapq     mrnm      mpos
                                                 <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer> | <integer> <integer> <factor> <integer>
  ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 chr11      +         44M        44  96806772  96806815        44         0 |         0      <NA>     <NA>         0

                                                                                      seq                                         qual        NH        HI        AS        nM
                                                                           <DNAStringSet>                               <PhredQuality> <integer> <integer> <integer> <integer>
  ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ         1         1        43         0


As you can see the MAPQ field in the input is 255 but is NA in the gapped alignment object, although mapq is defined in the ScanBamParam. The RNEXT field is also NA but I guess this is the representation of its "*" value in the bam record.

When I change the MAPQ in the input to be different from 255 (e.g., 3) this doesn't happen.

 -- output of sessionInfo(): 

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.8.10     Rsamtools_1.13.44     Biostrings_2.29.19    GenomicRanges_1.13.45 XVector_0.1.4         IRanges_1.19.38       BiocGenerics_0.7.5   

loaded via a namespace (and not attached):
[1] bitops_1.0-6   stats4_3.0.2   tools_3.0.2    zlibbioc_1.7.0


--
Sent via the guest posting facility at bioconductor.org.



More information about the Bioconductor mailing list