[BioC] Lifting over a bam file with Rsamtools and GenomicRanges

rubi [guest] guest at bioconductor.org
Sun Sep 22 20:37:46 CEST 2013


Hi,

I have a BAM file that was aligned to a non-reference genome, and I have the coordinate mapping between the non-reference and the reference genomes, either in chain format or in the following block-like format:
#CHR	REF	NONREF
chr1	1	1
chr1	3003641	0
chr1	3003645	3003641
chr1	3003650	0
chr1	3003654	3003646
chr1	3006791	0
chr1	3006793	3006783
chr1	3006835	0
chr1	3006836	3006825
chr1	0	3007262
chr1	3007273	0
chr1	3007275	3007263
chr1	0	3008478
chr1	3008490	3008481

Lines containing a 0 indicate deletions (either in the ref or in the nonref).
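
For concreteness, here is a minimal sketch of one way to read this table, written in Python purely for illustration (the same logic should carry over to R). It assumes that each row marks the start of a segment, that a row without a 0 starts an aligned block running until the next row, and that the last block is open-ended. The names parse_blocks and lift_pos are made up for this example and do not come from any package.

```python
from bisect import bisect_right

# (REF, NONREF) columns of the chr1 example table above.
rows = [
    (1, 1), (3003641, 0), (3003645, 3003641), (3003650, 0),
    (3003654, 3003646), (3006791, 0), (3006793, 3006783),
    (3006835, 0), (3006836, 3006825), (0, 3007262), (3007273, 0),
    (3007275, 3007263), (0, 3008478), (3008490, 3008481),
]

def parse_blocks(rows):
    """Return aligned blocks as (ref_start, nonref_start, length).

    Assumed reading of the format: every row starts a segment, and a 0
    means the segment is absent from that genome. The final aligned
    block has no known end, so its length is None.
    """
    blocks = []
    for i, (ref, nonref) in enumerate(rows):
        if ref == 0 or nonref == 0:
            continue  # segment present in only one genome
        length = None
        if i + 1 < len(rows):
            next_ref, next_nonref = rows[i + 1]
            # The next row ends this block; use whichever column is non-zero.
            length = (next_ref - ref) if next_ref else (next_nonref - nonref)
        blocks.append((ref, nonref, length))
    return blocks

def lift_pos(blocks, nonref_pos):
    """Map a 1-based nonref position to the ref, or None if it has no counterpart."""
    starts = [b[1] for b in blocks]
    i = bisect_right(starts, nonref_pos) - 1
    if i < 0:
        return None
    ref_start, nonref_start, length = blocks[i]
    offset = nonref_pos - nonref_start
    if length is not None and offset >= length:
        return None  # position lies in a nonref-only segment
    return ref_start + offset

blocks = parse_blocks(rows)
print(lift_pos(blocks, 3003641))  # 3003645
print(lift_pos(blocks, 3007262))  # None (nonref-only base)
```

Under this reading, nonref position 3003641 lands on ref position 3003645, while nonref position 3007262 has no counterpart in the ref.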

I would like to lift over my BAM file from the non-reference to the reference, which basically means lifting over the "pos" and "cigar" fields. Using the functions in GenomicRanges it is straightforward to do this for the "pos" field, but it gets complicated for the "cigar" field. Any ideas on how this can be done efficiently?
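
To illustrate what lifting the "cigar" field could involve, here is a rough per-base sketch in Python (for illustration only, not an efficient or complete solution). It assumes a collinear, same-strand mapping, treats read bases that fall in nonref-only segments as insertions (I), and adds a deletion (D) wherever the read spans a ref-only segment. The block list is derived by hand from the chr1 example table under the assumed reading that each row starts a segment, and lift_pos and lift_read are hypothetical names. Insertions at the very start or end of a read are not converted to soft clips here.

```python
import re
from bisect import bisect_right
from itertools import groupby

# Aligned blocks (ref_start, nonref_start, length) derived by hand from the
# chr1 example table (assumed interpretation); None marks an open-ended block.
BLOCKS = [
    (1, 1, 3003640), (3003645, 3003641, 5), (3003654, 3003646, 3137),
    (3006793, 3006783, 42), (3006836, 3006825, 437),
    (3007275, 3007263, 1215), (3008490, 3008481, None),
]
STARTS = [b[1] for b in BLOCKS]

def lift_pos(nonref_pos):
    """Map a 1-based nonref position to the ref, or None if nonref-only."""
    i = bisect_right(STARTS, nonref_pos) - 1
    if i < 0:
        return None
    ref_start, nonref_start, length = BLOCKS[i]
    offset = nonref_pos - nonref_start
    if length is not None and offset >= length:
        return None
    return ref_start + offset

def lift_read(pos, cigar):
    """Return (new_pos, new_cigar) for a read aligned to the nonref at pos."""
    ops = []         # one CIGAR letter per base, run-length encoded at the end
    q = pos          # cursor on the nonref genome
    prev_ref = None  # last ref base covered so far
    new_pos = None
    for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        for _ in range(int(n)):
            if op in "ISHP":         # does not consume the genome: keep as is
                ops.append(op)
                continue
            r = lift_pos(q)
            q += 1
            if r is None:            # base exists only in the nonref
                if op in "M=X":
                    ops.append("I")  # read base without a ref counterpart
                continue             # a deleted or skipped base just disappears
            if prev_ref is not None and r > prev_ref + 1:
                ops.extend("D" * (r - prev_ref - 1))  # ref-only gap
            if new_pos is None:
                new_pos = r
            ops.append(op)
            prev_ref = r
    new_cigar = "".join(f"{len(list(g))}{op}" for op, g in groupby(ops))
    return new_pos, new_cigar

print(lift_read(3003643, "10M"))  # (3003647, '3M4D7M')
print(lift_read(3007260, "6M"))   # (3007271, '2M1I2D3M')
```

The first read spans the 4-base ref-only segment at 3003650, so it gains a 4D. The second covers the nonref-only base 3007262 (which becomes 1I) and then the 2-base ref-only segment at 3007273 (which becomes 2D). A practical version would operate on whole runs rather than single bases.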

 -- output of sessionInfo(): 

R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.8.10    Rsamtools_1.10.2     Biostrings_2.26.3    GenomicRanges_1.10.7 IRanges_1.16.6       BiocGenerics_0.4.0  

loaded via a namespace (and not attached):
[1] bitops_1.0-5    parallel_2.15.2 stats4_2.15.2   tools_2.15.2    zlibbioc_1.4.0 


--
Sent via the guest posting facility at bioconductor.org.


