[BioC] stranded intronic variants with VariantAnnotation::locateVariants()

Valerie Obenchain vobencha at fhcrc.org
Wed Nov 6 01:15:23 CET 2013


This is implemented in VariantAnnotation v1.9.7. locateVariants() now
returns the strand of the subject that was hit, except for
IntergenicVariants().

The intergenic case returns multiple precede and follow gene IDs. When
'ignore.strand=TRUE', genes on both strands are searched and the result
can be a mixture of '+' and '-', so the strand returned for this case is
'*'. When 'ignore.strand=FALSE', only genes on the same strand as the
'query' are searched, so the returned strand matches the query.
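
A minimal sketch of the behaviour described above, adapted from the
snippet quoted further down in this thread. It is not output from an
actual session. Two things in it are assumptions made for illustration:
the query is stamped with '+' only so that 'ignore.strand' has something
to act on (variants read from a VCF are normally unstranded), and
rowRanges() is the accessor found in later releases (the release used in
this thread spelled it rowData()).

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
seqlevels(vcf) <- "chr22"

## rowRanges() in later releases; rowData(vcf) in the 2013 release.
rd <- rowRanges(vcf)

## Intronic hits: the strand column is expected to carry the strand of
## the subject that was hit rather than '*'.
loc_intron <- locateVariants(rd, txdb, IntronVariants())
table(strand(loc_intron))

## Assumption for illustration only: treat every variant as '+' stranded.
rd_plus <- rd
strand(rd_plus) <- "+"

## Intergenic hits: search genes on both strands, then only on the
## strand of the query, and compare the strand that comes back.
loc_any  <- locateVariants(rd_plus, txdb, IntergenicVariants(),
                           ignore.strand=TRUE)
loc_same <- locateVariants(rd_plus, txdb, IntergenicVariants(),
                           ignore.strand=FALSE)
table(strand(loc_any))
table(strand(loc_same))
```

Comparing the two table() calls at the end is the quickest way to check
which strand an installed version reports for the intergenic case.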

Valerie



On 10/18/2013 02:41 PM, Robert Castelo wrote:
> Great! thanks a lot Valerie!!
>
> robert.
>
> On 10/18/13 10:19 PM, Valerie Obenchain wrote:
>> Hi Robert,
>>
>> Yes, I can add that. I'll let you know when it's done.
>>
>> Valerie
>>
>> On 10/17/2013 04:01 AM, Robert Castelo wrote:
>>> hi,
>>>
>>> i have the following feature request for the VariantAnnotation package.
>>>
>>> currently, the function predictCoding() annotates the strand of variants
>>> within exons according to a given gene annotation. would it be possible
>>> that the function locateVariants() in the VariantAnnotation package
>>> annotates the strand for intronic variants?
>>>
>>> introns are non-coding, and therefore, not annotated with
>>> predictCoding(), but are stranded (GT-AG).
>>>
>>> here goes a code snippet that illustrates what i'm talking about
>>> (adapted from the vignette):
>>>
>>> =================
>>> library(VariantAnnotation)
>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>>>
>>> fl <- system.file("extdata", "chr22.vcf.gz",
>>> package="VariantAnnotation")
>>> vcf <- readVcf(fl, "hg19")
>>> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>>> seqlevels(vcf) <- "chr22"
>>> rd <- rowData(vcf)
>>> loc <- locateVariants(rd, txdb, IntronVariants())
>>>
>>> head(loc, n=3)
>>> GRanges with 3 ranges and 7 metadata columns:
>>>        seqnames               ranges strand | LOCATION   QUERYID      TXID     CDSID      GENEID
>>>           <Rle>            <IRanges>  <Rle> | <factor> <integer> <integer> <integer> <character>
>>>    [1]    chr22 [50300078, 50300078]      * |   intron         1     75253      <NA>       79087
>>>    [2]    chr22 [50300086, 50300086]      * |   intron         2     75253      <NA>       79087
>>>    [3]    chr22 [50300101, 50300101]      * |   intron         3     75253      <NA>       79087
>>>              PRECEDEID        FOLLOWID
>>>        <CharacterList> <CharacterList>
>>>    [1]
>>>    [2]
>>>    [3]
>>>    ---
>>>    seqlengths:
>>>     chr22
>>>        NA
>>> =================
>>>
>>> i.e., the strand column is set to * for the intronic variants. it's ok
>>> if this new feature would be added to the devel version, as happens
>>> normally with new features.
>>>
>>>
>>> thanks!
>>> robert.
>>> ps: sessionInfo()
>>> R version 3.0.2 (2013-09-25)
>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>>> [8] base
>>>
>>> other attached packages:
>>>   [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
>>>   [2] GenomicFeatures_1.14.0
>>>   [3] AnnotationDbi_1.24.0
>>>   [4] Biobase_2.22.0
>>>   [5] VariantAnnotation_1.8.0
>>>   [6] Rsamtools_1.14.1
>>>   [7] Biostrings_2.30.0
>>>   [8] GenomicRanges_1.14.1
>>>   [9] XVector_0.2.0
>>> [10] IRanges_1.20.0
>>> [11] BiocGenerics_0.8.0
>>>
>>> loaded via a namespace (and not attached):
>>>   [1] biomaRt_2.18.0     bitops_1.0-6       BSgenome_1.30.0 DBI_0.2-7
>>>   [5] RCurl_1.95-4.1     RSQLite_0.11.4     rtracklayer_1.22.0
>>> stats4_3.0.2
>>>   [9] tools_3.0.2        XML_3.95-0.2       zlibbioc_1.8.0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>


