[BioC] BiomaRt - Flanking regions

James W. MacDonald jmacdon at uw.edu
Tue Jul 17 14:59:06 CEST 2012


Hi Avoks,

On 7/16/2012 10:57 PM, Ovokeraye Achinike-Oduaran wrote:
> Thanks Jim.
>
> I will try out the ?getSequence command and see what options I get.
> What I am actually trying to do is to retrieve possible regulatory
> SNPs 1Kb upstream of the genes on my list.

In that case do you really want the sequences? Are these known SNPs 
(e.g., in dbSNP) or are they novel ones that you have detected with some 
sequencing? Trying to eyeball 1Kb of sequence data seems pretty 
daunting, as well as not a good use of time.

If I assume that you have a set of genes that you think have SNPs in 
upstream regulatory regions, and you just want a list of known SNPs in 
each region, then I think this is a job for one of the 
SNPlocs.Hsapiens.dbSNP packages, along with the GenomicFeatures and 
GenomicRanges packages.

The general idea would be to create a GRangesList of all the SNPs from 
the SNPlocs package, and then create a GRangesList for all the flanking 
regions for which you have an interest. You could then use 
findOverlaps() to get the SNPs that are found in the flanking regions.
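
A minimal sketch of that idea, using only GenomicRanges and made-up 
coordinates. The gene ranges, SNP positions, and the gene_id/rs_id 
columns are all hypothetical, and plain GRanges objects are used instead 
of a GRangesList to keep it short. In practice the gene ranges might come 
from a TxDb object (e.g., via genes() from GenomicFeatures) and the SNPs 
from a SNPlocs package.

```r
library(GenomicRanges)

# Toy gene coordinates (hypothetical values, for illustration only)
genes <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(5000, 20000, 8000),
                     end   = c(7000, 23000, 9500)),
  strand   = c("+", "-", "+"),
  gene_id  = c("geneA", "geneB", "geneC")
)

# The 1 kb immediately upstream of each gene. flank() is strand-aware,
# so for a minus-strand gene the region lies past the gene's end.
upstream <- flank(genes, width = 1000, start = TRUE)

# Toy SNP positions (hypothetical); strand defaults to "*"
snps <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(4500, 6000, 23500, 7999), width = 1),
  rs_id    = c("rs1", "rs2", "rs3", "rs4")
)

# Which SNPs fall inside which upstream region?
hits <- findOverlaps(snps, upstream)

result <- data.frame(
  rs_id   = snps$rs_id[queryHits(hits)],
  gene_id = upstream$gene_id[subjectHits(hits)]
)
result
```

Here rs2 sits inside geneA itself rather than upstream of it, so it 
should not appear in the result.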

Best,

Jim
>
> Thanks again. I'll see what that gives me.
>
> Kind regards,
>
> Avoks
>
> On 7/16/12, James W. MacDonald <jmacdon at uw.edu> wrote:
>> Hi Avoks,
>>
>> It depends on what you mean by 'flanking gene regions'. Do you want the
>> sequences? The coordinates?
>>
>> Assuming you want the sequences:
>>
>> library(biomaRt)
>> ?getSequence
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On 7/16/2012 9:53 AM, Ovokeraye Achinike-Oduaran wrote:
>>> Hi all,
>>>
>>> I had a quick question on how to get flanking gene regions using
>>> biomaRt to access Biomart. I used listAttributes() to see what my
>>> options were but I could not quite figure it out. Any ideas?
>>>
>>> Thanks and kind regards.
>>>
>>> -Avoks
>>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


