[BioC] intragenic sequence extraction

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Dec 5 02:02:14 CET 2011


Hi,

On Sun, Dec 4, 2011 at 4:16 PM, Yating Cheng <yating.cheng at charite.de> wrote:
> Hi Steve
>
> Thank you for your answer. Then I ran this script, but there is error in
> the end. And I don't know how to solve it.
>
> library(Biostrings)
> library(GenomicRanges)
>
>
> library(BSgenome.Amellifera.UCSC.apiMel2)
> bee.txdb<- makeTranscriptDbFromUCSC(genome="apiMel2",
>  tablename="ensGene")
>  tx <- transcripts(bee.txdb, columns=c("tx_id", "tx_name", "gene_id"))
>
>
> intragenic_seq<-getSeq(Amellifera,gaps(tx))
>
> #####Error in .starfreeStrand(strand(names)) :
>  cannot mix "*" with other strand values

When you call `gaps()` you get ranges (usually as long as your
chromosome) with strands assigned as '*' -- you have to filter those
out.

Running with your example, you'd do:

R> intergenic <- gaps(tx)
R> intergenic <- intergenic[strand(intergenic) != '*']
R> intergenic_seq <- getSeq(Amellifera, intergenic)

On a related note -- I think I find that behavior from gaps (returning
full chromosome-length ranges assigned to the '*' strand) more
annoying than useful ... especially when none of my ranges defined in
the thing I am calling gaps on has ranges assigned to `*` ... but a
tale for a different thread, perhaps ...

HTH,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



More information about the Bioconductor mailing list