[BioC] How to retrieve exon ID and gene ID from exon coordinates?

James W. MacDonald jmacdon at uw.edu
Mon Sep 10 23:29:24 CEST 2012


Hi Ying,

On 9/10/2012 4:54 PM, ying chen wrote:
> Hi guys, I have an RNA-Seq data table which has exon coordinates (chrom, start, end) and raw counts. I want to use DEXSeq to see differential transcripts. To do it I need to get geneIDs and exonIDs from the corresponding exon coordinates. Any suggestions on how to do it? Thanks a lot for the help!

You don't give much to go on. Assuming you are working with a common 
species, it is simple. Let's assume you are working with mice.

Something like this should work:

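# Assumes yourdata.txt has a header row naming the columns chrom, start, end
yourdata <- read.table("yourdata.txt", header=TRUE, stringsAsFactors=FALSE)
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
# All known exons, carrying their exon and gene IDs as metadata
ex <- exons(TxDb.Mmusculus.UCSC.mm9.knownGene, columns=c("exon_id","gene_id"))
# Convert your coordinates to a GRanges (note this replaces the data.frame)
yourdata <- GRanges(yourdata$chrom, IRanges(start=yourdata$start, end=yourdata$end))
# Look up each of your exons in the annotation and copy the IDs across
elementMetadata(yourdata) <- elementMetadata(ex)[match(yourdata, ex),]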
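
A variation on the same lookup, as a minimal sketch on made-up data rather than a definitive recipe. It keeps the raw counts next to the IDs and leaves NA where an exon has no exact match in the annotation. The toy annotation, the IDs in it, and the names tab, gr and idx are all hypothetical stand-ins. It also swaps match() for findOverlaps() with type="equal", which treats an unstranded range as compatible with either strand and returns NA when nothing matches.

```r
library(GenomicRanges)

# Toy stand-in for the exon annotation (hypothetical IDs, not real mm9 exons)
ex <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(start = c(100, 500, 200), end = c(200, 650, 300)),
              strand = c("+", "+", "-"),
              exon_id = 1:3,
              gene_id = c("geneA", "geneA", "geneB"))

# Toy stand-in for the count table; the last row matches no annotated exon
tab <- data.frame(chrom = c("chr2", "chr1", "chr1"),
                  start = c(200, 100, 900),
                  end   = c(300, 200, 950),
                  count = c(12, 40, 3),
                  stringsAsFactors = FALSE)

# Unstranded ranges built from the table coordinates
gr <- GRanges(tab$chrom, IRanges(start = tab$start, end = tab$end))

# For each row, index of the first exon with identical start and end
# (NA if there is none)
idx <- findOverlaps(gr, ex, type = "equal", select = "first")

# Attach the IDs to the original table so the counts are kept
tab$exon_id <- mcols(ex)$exon_id[idx]
tab$gene_id <- mcols(ex)$gene_id[idx]

# Rows that could not be assigned to an annotated exon
tab[is.na(idx), ]
```

With a real TxDb object, gene_id comes back as a list-like column (an exon can belong to more than one gene), so it may need flattening before it goes into a plain data.frame.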

If you are planning on doing this sort of stuff, do yourself a favor and 
read the GenomicFeatures and GenomicRanges vignettes. They are chock 
full of info that you will need.

Best,

Jim



> Ying

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


