[BioC] VariantAnnotation: fine define Locating variants in and around genes

Valerie Obenchain vobencha at fhcrc.org
Thu Jan 31 22:30:45 CET 2013


Hi Fabrice,

To identify SNPs (or any ranges) in introns only, use IntronVariants() 
as the 'region' argument. CodingVariants() corresponds to the exon 
regions. If you want all regions except coding, I would suggest using 
AllVariants() and then subsetting the result on LOCATION.
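
As a rough sketch of both suggestions: the setup below assumes the man 
page example (the gl_chrom.vcf file shipped with VariantAnnotation and 
the TxDb.Hsapiens.UCSC.hg19.knownGene package). The names 'loc_introns', 
'loc_all', 'keep' and 'noncoding' are illustrative only, and keeping 
intron plus UTR hits is just one reading of "within the transcript but 
outside coding".

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

## Assumed setup: example VCF bundled with VariantAnnotation.
fl <- system.file("extdata", "gl_chrom.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

## Assumption: the VCF names chromosomes "1", "2", ... while the TxDb
## uses "chr1", "chr2", ..., so prefix the seqlevels to make them match.
vcf_adj <- renameSeqlevels(vcf, paste0("chr", seqlevels(vcf)))

## Introns only.
loc_introns <- locateVariants(vcf_adj, txdb, IntronVariants())

## All regions, then keep hits inside transcripts but outside coding.
loc_all <- locateVariants(vcf_adj, txdb, AllVariants())
keep <- loc_all$LOCATION %in% c("intron", "fiveUTR", "threeUTR")
noncoding <- loc_all[keep]

## Counts per remaining location type.
table(droplevels(noncoding$LOCATION))
```

Add "spliceSite" to the vector passed to %in% if splice site hits should 
be kept as well.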

This output is from the man page example. The 'loc_coding' name is 
misleading since AllVariants() was used as 'region'. I have changed it 
to 'loc_all' in the devel branch.

 > loc_coding <- locateVariants(vcf_adj, txdb, AllVariants())
 > loc_coding
GRanges with 16 ranges and 7 metadata columns:
              seqnames               ranges strand |   LOCATION   QUERYID
                 <Rle>            <IRanges>  <Rle> |   <factor> <integer>
                  chr1 [   13220,    13220]      * |     intron         1
                  chr1 [   13220,    13220]      * | spliceSite         1
                  chr1 [   13220,    13220]      * |     intron         1
                  chr1 [   13220,    13220]      * |     intron         1
                  chr1 [   13220,    13220]      * | spliceSite         1
...
...

This example has variants in splice sites, introns, coding and 
intergenic regions.

 > tbl <- table(loc_coding$LOCATION)
 > tbl[tbl > 0]

spliceSite     intron     coding intergenic
          2          7          2          5

The result can be subset on LOCATION for the region of interest. The 
QUERYID column maps back to the row number in the original 'query' 
argument to locateVariants().

 > introns <- loc_coding[loc_coding$LOCATION == "intron", ]
 > head(introns, 3)
GRanges with 3 ranges and 7 metadata columns:
    seqnames         ranges strand | LOCATION   QUERYID      TXID
       <Rle>      <IRanges>  <Rle> | <factor> <integer> <integer>
        chr1 [13220, 13220]      * |   intron         1         1
        chr1 [13220, 13220]      * |   intron         1         2
        chr1 [13220, 13220]      * |   intron         1         3
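
To go from the located ranges back to the original variants, QUERYID can 
be used as a row index into the query. A minimal sketch, again assuming 
the man page setup; 'idx' and 'intron_vcf' are illustrative names, and 
the result is called 'loc_all' here rather than 'loc_coding'.

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

## Assumed setup, as in the man page example.
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
fl <- system.file("extdata", "gl_chrom.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")
vcf_adj <- renameSeqlevels(vcf, paste0("chr", seqlevels(vcf)))

loc_all <- locateVariants(vcf_adj, txdb, AllVariants())
introns <- loc_all[loc_all$LOCATION == "intron"]

## One variant can fall in the introns of several transcripts, so the
## same QUERYID appears on several rows; reduce to unique row numbers.
idx <- unique(introns$QUERYID)

## Rows of the original query with at least one intron hit.
intron_vcf <- vcf_adj[idx, ]

## Transcript ids hit by each query variant.
split(introns$TXID, introns$QUERYID)
```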


Valerie


On 01/31/2013 12:34 PM, Fabrice Tourre wrote:
> Dear list,
>
> I am using VariantAnnotation to locate variants in and around genes.
>
> In VariantAnnotation, the region is defined as: CodingVariants,
> IntronVariants, FiveUTRVariants, ThreeUTRVariants, IntergenicVariants,
> SpliceSiteVariants or PromoterVariants.
>
> Is it possible to know whether a SNP is in an exon/intron within the
> transcribed region but outside the coding region?
>
> Thanks.