[BioC] find gene symbols immediately flanking before and after a SNP position

Valerie Obenchain vobencha at fhcrc.org
Thu Feb 14 00:53:16 CET 2013


Hi Adai,

When using precede() and follow(), the 'x' and 'subject' arguments can be 
any of the following combinations:

 > showMethods("precede")
Function: precede (package IRanges)
x="ANY", subject="SummarizedExperiment"
x="GenomicRanges", subject="GenomicRanges"
x="GenomicRanges", subject="missing"
x="Ranges", subject="RangesORmissing"
x="SummarizedExperiment", subject="ANY"
x="SummarizedExperiment", subject="SummarizedExperiment"


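The function transcriptsBy() returns a GRangesList, and there is no 
precede() method for a GRanges 'x' with a GRangesList 'subject', which 
is why you see the error. Instead, try the transcripts() function, 
which returns a GRanges: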

     tx <- transcripts(txdb)
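
As a rough sketch of how precede() and follow() behave once both 
arguments are GRanges, here is a toy example. The coordinates and the 
names snpA, txLeft, txRight and txFar are made up for illustration and 
are not taken from the UCSC annotation. Using ignore.strand=TRUE is an 
assumption on my part that you want the left and right neighbours 
regardless of which strand a transcript is on.

```r
library(GenomicRanges)

# Toy SNP at a single position (hypothetical coordinates).
snp <- GRanges("chr9", IRanges(500, width = 1, names = "snpA"))

# Toy transcripts standing in for transcripts(txdb); names are made up.
tx <- GRanges("chr9",
              IRanges(start = c(100, 700, 900), end = c(200, 800, 1000)),
              strand = c("+", "-", "+"))
mcols(tx)$tx_name <- c("txLeft", "txRight", "txFar")

# With ignore.strand = TRUE all ranges are treated as if on one strand,
# so precede() gives the index of the nearest range to the right of the
# SNP and follow() gives the index of the nearest range to the left.
right <- precede(snp, tx, ignore.strand = TRUE)
left  <- follow(snp, tx, ignore.strand = TRUE)

# Collect the flanking transcript names per SNP.
flank <- data.frame(snp   = names(snp),
                    left  = mcols(tx)$tx_name[left],
                    right = mcols(tx)$tx_name[right])
```

Both functions return NA for a SNP that has no range on that side, and 
ranges that overlap the SNP are not reported by either function.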


Another function worth exploring is locateVariants() in the 
VariantAnnotation package. See the examples on the ?locateVariants man 
page to make sure the seqnames (chromosome names) in your data and the 
txdb match. You can try using the AllVariants() region:

     loc <- locateVariants(query, subject, AllVariants())

or IntergenicVariants() if you are sure the SNP is intergenic:

     loc <- locateVariants(query, subject, IntergenicVariants())

In these examples, 'query' is a GRanges of your data and 'subject' is 
the txdb you made from UCSC.
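
A quick way to check that the chromosome names agree before calling 
locateVariants() might look like the sketch below. The toy 'query' and 
'tx' objects are stand-ins with made-up ranges; with real data 'tx' 
would come from the txdb. The renaming step assumes the two naming 
schemes differ only by the "chr" prefix.

```r
library(GenomicRanges)

# Toy query using bare chromosome names ("9", "6") rather than UCSC style.
query <- GRanges(c("9", "6"), IRanges(c(123652898, 31252925), width = 1))

# Stand-in for ranges from a UCSC-based txdb, which uses "chr" prefixes.
tx <- GRanges(c("chr6", "chr9"), IRanges(c(1, 1), width = 100))

# An empty intersection means no query range can match the annotation.
shared_before <- intersect(seqlevels(query), seqlevels(tx))

# Rename the query's seqlevels in place (assumes only the prefix differs).
seqlevels(query) <- paste0("chr", seqlevels(query))
shared_after <- intersect(seqlevels(query), seqlevels(tx))
```

If 'shared_after' is still empty, the names differ in some other way 
and need to be mapped by hand.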

Valerie





On 02/13/2013 02:38 PM, Adaikalavan Ramasamy wrote:
> Dear all,
>
> I have a list of several hundred SNPs that I would like to annotate
> functionally and am able to do this via websites such as SeattleSeq.
> However, for intergenic SNPs it does not give me the neighbouring
> genes. Therefore, I have tried to find genes immediately flanking a
> SNP (one left and right) in R. I note that this question has been
> asked previously. I am trying to follow one of the previous
> suggestions (https://stat.ethz.ch/pipermail/bioconductor/2010-December/037185.html).
> I have been struggling with this for the last two days but I think I am
> getting something fundamentally wrong.
>
> I have chosen the following two SNPs (among several thousands). I am
> expecting to see the following kind of output:
>     rs881375    (chr9:123652898)  is located between PHF19 and TRAF1
>     rs12191877 (chr6:31252925)   is located between RPL3P2 and WASF5P
>
>
> First, I code the query up as a GRanges object:
>
>     rsid <- c("rs881375", "rs12191877")
>     chr  <- c("chr9",     "chr6")
>     pos  <- c(123652898,  31252925)
>
>     library(GenomicFeatures)
>
>     target <- GRanges(
>        seqnames = Rle( chr ),
>        ranges   = IRanges(pos, end=pos, names=rsid),
>        strand   = Rle(strand( rep("*", length(chr)) ))
>     )
>
>     # GRanges with 2 ranges and 0 metadata columns:
>     #              seqnames                 ranges strand
>     #                 <Rle>              <IRanges>  <Rle>
>     #     rs881375     chr9 [123652898, 123652898]      *
>     #   rs12191877     chr6 [ 31252925,  31252925]      *
>     #   ---
>     #   seqlengths:
>     #    chr6 chr9
>     #      NA   NA
>
>     txdb <- makeTranscriptDbFromUCSC("hg19")   # about 5 min
>     tx  <- transcriptsBy(txdb)
>
>
> But when I try
>     precede( target, tx )
>     follow( target, tx )
>
> I get the following message:
>     Error in function (classes, fdef, mtable)  :
>       unable to find an inherited method for function ‘precede’ for signature ‘"GRanges", "GRangesList"’
>
> Any help would be very much appreciated. I am happy to try other
> packages or websites if available. Many thanks.
>
> Regards, Adai