[BioC] chromosome position --> gene

Steffen Durinck sdurinck at ebi.ac.uk
Mon Jun 27 10:42:13 CEST 2005


Hi Marco,

The biomaRt package has been updated so that it can perform your query.  The 
update (v1.1.3) should be available among the development packages on 
the Bioconductor website, probably by tomorrow.  If you want it sooner, 
I can send it to you, but you will have to tell me which platform you 
work on.  biomaRt has been used on Windows and Linux, and it might also 
work on OS X if you get RMySQL working.
You can query for Affy, RefSeq, Entrez Gene, HUGO and Ensembl IDs.

Here are some examples:

 >mart<-martConnect()
connected to:  ensembl_mart_31

 >#get all affy features of array hgu95av2 that match sequences on chromosome 2
 >getFeature(chromosome = 2, array="hg_u95av2",mart=mart)
Object of class 'martTable' with 749 IDs. The first 5 rows are:
  object@id[1:n] chromosome   start     end
1       33895_at          2  208151  254745
2       36611_at          2  254896  268280
3       35928_at          2 1396242 1525502
4       39327_at          2 1605957 1718575
5       32712_at          2 1763190 2305323

 >getFeature(chromosome = 2, start=200000, end=300000, 
array="hg_u95av2",mart=mart)

Object of class 'martTable' with 2 IDs.
  object@id[1:n] chromosome  start    end
1       33895_at          2 208151 254745
2       36611_at          2 254896 268280

 ># query for Ensembl identifiers
 >getFeature(chromosome = 2, start=200000, end=300000, type="ensembl", 
species="hsapiens",mart=mart)

Object of class 'martTable' with 3 IDs.
   object@id[1:n] chromosome  start    end
1 ENSG00000035115          2 208151 254745
2 ENSG00000143727          2 254896 268280
3 ENSG00000189292          2 269576 278182

 >#query for refseq identifiers
 >getFeature(chromosome = 2, start=200000, end=300000, type="refseq", 
species="hsapiens",mart=mart)
Object of class 'martTable' with 4 IDs.
  object@id[1:n] chromosome  start    end
1      NM_004300          2 254896 268280
2      NM_007099          2 254896 268280
3      NM_177554          2 254896 268280
4   NM_001002919          2 269576 278182
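
For anyone who wants to see what a range query like the ones above boils down to, here is a minimal base R sketch. It is not how biomaRt does it (biomaRt queries the Ensembl mart database); the helper name features_in_range is hypothetical, the table holds only the three Ensembl rows printed above, and treating a "match" as any overlap with the requested range is an assumption.

```r
# Toy annotation table; coordinates copied from the Ensembl example output above.
genes <- data.frame(
  id         = c("ENSG00000035115", "ENSG00000143727", "ENSG00000189292"),
  chromosome = c("2", "2", "2"),
  start      = c(208151, 254896, 269576),
  end        = c(254745, 268280, 278182),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of biomaRt): return the rows of `table` on
# `chromosome` that overlap the closed interval [start, end].
# Assumption: a feature matches if it overlaps the range at all,
# not only if it lies entirely inside it.
features_in_range <- function(table, chromosome, start, end) {
  hit <- table$chromosome == as.character(chromosome) &
    table$start <= end &
    table$end >= start
  table[hit, , drop = FALSE]
}

# Range query, analogous to start=200000, end=300000 above
res <- features_in_range(genes, chromosome = 2, start = 200000, end = 300000)
print(res)

# A single position is just a range of width one
res_point <- features_in_range(genes, chromosome = 2, start = 260000, end = 260000)
print(res_point)
```

The same function would work on any table with chromosome, start and end columns, for example one downloaded from a genome browser.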


Cheers,
Steffen

Sean Davis wrote:

>Marco,
>
>I don't think there is a direct function for doing this, but I could be 
>wrong (and would love to be corrected).  However, I can think of at least 
>three options:
>
>1)  Output your chromosomes and locations in a format that can be used in 
>the UCSC genome browser to construct a custom track, which is actually quite 
>simple.  Then, you can "intersect" your custom track with other things 
>available from the genome browser including, but not limited to, genes.
>
>2)  You can download the table from the UCSC genome browser that you are 
>interested in (or get it via the table browser).  Then, you can use that 
>table to construct your own function.
>
>3)  You could use RMySQL to directly query the EnsEMBL database to get the 
>information you want.  This would require a bit of understanding of the 
>EnsEMBL database and of SQL queries.  (See the biomaRt package for some 
>examples....)
>
>Number 1 has two advantages:  you can intersect with multiple 
>annotations and it is probably less work.  Number 2 has the advantage that 
>it can be done entirely from R (including the download).  If you are EnsEMBL 
>centric and comfortable with SQL, number 3 may do it for you.
>
>Sean
>
>----- Original Message ----- 
>From: "Sorani, Marco" <soranim at pharmacy.ucsf.edu>
>To: <bioconductor at stat.math.ethz.ch>
>Sent: Sunday, June 26, 2005 1:19 PM
>Subject: [BioC] chromosome position --> gene
>
>
>  
>
>>Is there R code available that maps a chromosomal position (e.g., "chr2", 
>>"80,000,000") or range (e.g., "chr2", "80,000,000 - 90,000,000") to a 
>>gene?
>>
>>******************
>>soranim at pharmacy.ucsf.edu
>>Marco Sorani
>>Program in Biological & Medical Informatics
>>University of California, San Francisco
>>
>>_______________________________________________
>>Bioconductor mailing list
>>Bioconductor at stat.math.ethz.ch
>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>
>>    
>>
>


