[BioC] Extract junction reads from bam

Mon Aug 18 23:20:49 CEST 2014

If you've already loaded some alignments in your "aln" object, then you can
extract the aligned regions (the exonic parts) with:

exonic <- grglist(aln)

Then, you want to see which read has an exonic region starting at position
X (i.e., X is a 3' splice site), with the extra constraint that there must
be at least one exonic region preceding (so we know there is a splice).

trueForReadsSplicingToX <- any(which(start(exonic) == X) > 1)

That may or may not be fast, but it is simple. Didn't test it.

Michael

On Mon, Aug 18, 2014 at 1:32 PM, Helen Zhou <zhou.helen at yahoo.com> wrote:

> Dear Sir/Madam,
>
> I have a bam file with reads from a paired-end RNA-seq experiments, and I
> would like to extract all reads that span a particular intron/exon
> junction. For example where part of a read maps to exon ABC starting at chr
> 1 position 66909348. I do not know where the first part of the read maps;
> there are multiple possible exons being spliced to exon ABC.
>
> I can extract all reads mapping to and spanning for example 50nt around
> the junction, saying:
> library(GenomicRanges)
> loc <-RangesList('1'=IRanges(start=6690298,end=6690398))
> parameter <- ScanBamParam(which=loc, what=extract, simpleCigar=FALSE,
> reverseComplement=FALSE)
> aln <- readGAlignmentsFromBam("test.bam", param=parameter)
>
> However, this includes all the reads that span across this region because
> the ends of the read map outside the area indicated in 'loc'. How can I get
> only the reads where (part of) the read starts mapping at the exact
> intron/exon location at position 66909348?
>
> Thanks you
> H Zhou
>         [[alternative HTML version deleted]]
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>

	[[alternative HTML version deleted]]