[BioC] Find Affy probes within a particular region

James W. MacDonald jmacdon at med.umich.edu
Tue Jun 17 18:35:31 CEST 2008


Hi Daniel,

Daniel Brewer wrote:
> Hi,
> 
> I was wondering what the best way to find which Affymetrix probes are
> within a specific genomic region (chromosome, start, stop).  I am not
> sure if Biomart nor the annotation.db can do this as they both go to
> some common ID first.  The annotation.db stuff seems to only have one
> position information too.  The other option is to dwonload the
> annotation file from Affymetrix and load that in, but I would prefer to
> avoid that if at all possible.

So it depends on exactly what you are trying to do here. The annotation 
package will give you the chromosome and start location for the 
probeset. You can use biomaRt to get the chromosome, start, stop for the 
probeset as well:

 > getBM(c("chromosome_name", "start_position","end_position"), 
"affy_hg_u133_plus_2", "1007_s_at", mart)
   chromosome_name start_position end_position
1               6       30956784     30975912

Of course the question posed is to find things within a specific region, 
not find the region given the probeset ID. If this is what you want, 
then an annotation.db package might be what you want. Something like

library(mouse4302.db)
tab <- toTable(mouse4302CHRLOC)
ind <- tab[,3] == 6 & abs(tab[,2]) <= 53043660 &
	abs(tab[,2]) >= 50000000
tab[ind,]

 > head(tab[ind,])
          probe_id start_location Chromosome
1481   1416884_at       51420614          6
1482   1432628_at       51420614          6
1483 1439421_x_at       51420614          6
1484 1448504_a_at       51420614          6
1485   1454305_at       51420614          6
2397 1422483_a_at      -50512561          6

If you then want the probeset start, end, you could feed the Affy IDs to 
biomaRt.

But you specifically said probes, so you might actually want probes 
instead of probesets. If you want to get that specific then you could 
use the Biostrings package along with one of the genome packages 
(warning - HUGE download) and a probe package to get the locations of 
these probes. I'm not going to give code for that. Instead see the 
vignette in the BSgenome package, which has lots of examples.

Best,

Jim


> 
> Has anyone got any ideas.
> 
> Many thanks
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



More information about the Bioconductor mailing list