[BioC] stranded intronic variants with VariantAnnotation::locateVariants()

Robert Castelo robert.castelo at upf.edu
Wed Nov 6 09:03:59 CET 2013


Wonderful!! thanks a lot!!

robert.

On 11/6/13 1:15 AM, Valerie Obenchain wrote:
> This is implemented in v1.9.7. locateVariants() now returns the
> strand of the subject that was hit, except for IntergenicVariants().
>
> The intergenic case returns multiple precede and follow gene IDs.
> When 'ignore.strand=TRUE', genes on both strands are searched and the
> result can be a mixture of '+' and '-'; in this case the strand
> returned is '*'. When 'ignore.strand=FALSE', only genes on the same
> strand as the 'query' are searched, so the returned strand matches
> the query.
>
> Valerie
>
>
>
> On 10/18/2013 02:41 PM, Robert Castelo wrote:
>> Great! thanks a lot Valerie!!
>>
>> robert.
>>
>> On 10/18/13 10:19 PM, Valerie Obenchain wrote:
>>> Hi Robert,
>>>
>>> Yes, I can add that. I'll let you know when it's done.
>>>
>>> Valerie
>>>
>>> On 10/17/2013 04:01 AM, Robert Castelo wrote:
>>>> hi,
>>>>
>>>> i have the following feature request for the VariantAnnotation 
>>>> package.
>>>>
>>>> currently, the function predictCoding() annotates the strand of
>>>> variants within exons according to a given gene annotation. would it
>>>> be possible that the function locateVariants() in the
>>>> VariantAnnotation package annotates the strand for intronic variants?
>>>>
>>>> introns are non-coding, and therefore, not annotated with
>>>> predictCoding(), but are stranded (GT-AG).
>>>>
>>>> here goes a code snippet that illustrates what i'm talking about
>>>> (adapted from the vignette):
>>>>
>>>> =================
>>>> library(VariantAnnotation)
>>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>>>>
>>>> fl <- system.file("extdata", "chr22.vcf.gz",
>>>> package="VariantAnnotation")
>>>> vcf <- readVcf(fl, "hg19")
>>>> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>>>> seqlevels(vcf) <- "chr22"
>>>> rd <- rowData(vcf)
>>>> loc <- locateVariants(rd, txdb, IntronVariants())
>>>>
>>>> head(loc, n=3)
>>>> GRanges with 3 ranges and 7 metadata columns:
>>>>       seqnames               ranges strand | LOCATION   QUERYID      TXID     CDSID      GENEID
>>>>          <Rle>            <IRanges>  <Rle> | <factor> <integer> <integer> <integer> <character>
>>>>   [1]    chr22 [50300078, 50300078]      * |   intron         1     75253      <NA>       79087
>>>>   [2]    chr22 [50300086, 50300086]      * |   intron         2     75253      <NA>       79087
>>>>   [3]    chr22 [50300101, 50300101]      * |   intron         3     75253      <NA>       79087
>>>>              PRECEDEID        FOLLOWID
>>>>        <CharacterList> <CharacterList>
>>>>    [1]
>>>>    [2]
>>>>    [3]
>>>>    ---
>>>>    seqlengths:
>>>>     chr22
>>>>        NA
>>>> =================
>>>>
>>>> i.e., the strand column is set to * for the intronic variants. it's ok
>>>> if this new feature is added only to the devel version, as normally
>>>> happens with new features.
>>>>
>>>>
>>>> thanks!
>>>> robert.
>>>> ps: sessionInfo()
>>>> R version 3.0.2 (2013-09-25)
>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] parallel  stats     graphics  grDevices utils datasets methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>>   [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
>>>>   [2] GenomicFeatures_1.14.0
>>>>   [3] AnnotationDbi_1.24.0
>>>>   [4] Biobase_2.22.0
>>>>   [5] VariantAnnotation_1.8.0
>>>>   [6] Rsamtools_1.14.1
>>>>   [7] Biostrings_2.30.0
>>>>   [8] GenomicRanges_1.14.1
>>>>   [9] XVector_0.2.0
>>>> [10] IRanges_1.20.0
>>>> [11] BiocGenerics_0.8.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>>   [1] biomaRt_2.18.0     bitops_1.0-6       BSgenome_1.30.0    DBI_0.2-7
>>>>   [5] RCurl_1.95-4.1     RSQLite_0.11.4     rtracklayer_1.22.0 stats4_3.0.2
>>>>   [9] tools_3.0.2        XML_3.95-0.2       zlibbioc_1.8.0
>>>>
>>>
>>
>
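
A minimal sketch of how the behaviour described above could be checked,
assuming VariantAnnotation >= 1.9.7 and the same example data as in the
original request. rowRanges() is used here because current releases use it
where the 2013 releases in this thread used rowData(); the object names
loc_intron and loc_inter are only illustrative.

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Rename the single seqlevel so it matches the TxDb naming ("chr22").
seqlevels(vcf) <- "chr22"

# Assumption: a current release; with the 2013 releases use rowData(vcf).
rd <- rowRanges(vcf)

# Intronic variants: the strand should now come from the subject that was hit.
loc_intron <- locateVariants(rd, txdb, IntronVariants())
table(strand(loc_intron))

# Intergenic variants, searching genes on both strands.
loc_inter <- locateVariants(rd, txdb, IntergenicVariants(),
                            ignore.strand = TRUE)
table(strand(loc_inter))
```

Since the ranges read from a VCF are unstranded ('*'), the intergenic call
should report '*' with either setting of ignore.strand, going by the
explanation above.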


