[BioC] Introns in exons and vice versa (Genomic Features)

Sean Davis sdavis2 at mail.nih.gov
Wed Oct 3 13:12:26 CEST 2012


On Wed, Oct 3, 2012 at 6:50 AM, Fenton Christopher Graham
<christopher.fenton at uit.no> wrote:
> library(GenomicFeatures)
>
> #get known genes
> genes <- makeTranscriptDbFromUCSC(genome = "hg19",tablename = "knownGene")
> saveFeatures(genes, file="hg19_knownGene.sqlite")
> txdb    <- loadFeatures("hg19_knownGene.sqlite")
>
> #get introns
> introns <- intronsByTranscript(txdb)
> introns <- unlist(introns)
>
> #get exons
> exons    <- exonsBy(txdb)
> exons    <- unlist(exons)
>
> #take chr1 and "+" stand
> introns_ind  <- which(seqnames(introns) == "chr1" & strand(introns) == "+")
> exons_ind   <- which(seqnames(exons) == "chr1" & stand(exons) == "+")
>
> #convert to ranges
> introns_chr1 <- ranges(introns[introns_ind,])
> exons_chr1   <- ranges(exons[exons_ind,])
>
> #look for introns in exons?
> sum(countOverlaps(introns_chr1, exons_chr1, type="within"))
>
> Why isn't this number 0?
> Why are we looking at over 1000 introns within exons?

The genome is a strange place, isn't it?  To see what is going on,
extend your example just a bit:

intronsWithinExons_ind =
queryHits(findOverlaps(introns_chr1,exons_chr1,type='within'))
intronsWithinExons = introns_chr1[intronsWithinExons_ind,]

Try looking at a few of these in the UCSC genome browser.  The few I
checked are there.

Sean



More information about the Bioconductor mailing list