[BioC] How to get position for each gene ID/gene symbol instead of position for each transcript

shirley zhang shirley0818 at gmail.com
Wed Aug 25 05:00:31 CEST 2010


Thanks Steve. It is exactly what I want.

Good night!
Shirley

On Tue, Aug 24, 2010 at 10:41 PM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Sorry:
>
>> You can do this pretty "simply" with GenomicFeatures, if you want to
>> stick with that:
>>
>> R> txdb <- loadFeatures('your.transcript.db')
>> R> xcripts <- transcriptsBy(txdb, by='gene')
>>
>> ## This part is really slow -- this will be subject of next email
>> R> gene.bounds <- seqapply(xcripts, reduce)
>
> Should have used `range` instead of `reduce` here:
>
> R> gene.bounds <- seqapply(xcripts, range)
>
> The rest is the same ...
>
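
To see why `range` fits here, a toy base-R sketch of the two operations on one gene's transcripts. This is not the IRanges implementation: `interval_range` and `interval_reduce` are hypothetical helper names, and the coordinates are made up for illustration.

```r
# Toy transcripts for a single gene: two overlap, one lies further downstream.
tx <- data.frame(start = c(100, 150, 400), end = c(200, 300, 500))

# "range"-like: one interval from the smallest start to the largest end.
interval_range <- function(iv) {
  data.frame(start = min(iv$start), end = max(iv$end))
}

# "reduce"-like: merge overlapping intervals but keep the gaps between them.
# (Simplified assumption: only true overlaps are merged here.)
interval_reduce <- function(iv) {
  iv <- iv[order(iv$start), ]
  out <- iv[1, , drop = FALSE]
  for (i in seq_len(nrow(iv))[-1]) {
    last <- nrow(out)
    if (iv$start[i] <= out$end[last]) {
      # Overlaps the interval being built: extend its end if needed.
      out$end[last] <- max(out$end[last], iv$end[i])
    } else {
      # Separated by a gap: start a new interval.
      out <- rbind(out, iv[i, , drop = FALSE])
    }
  }
  rownames(out) <- NULL
  out
}

gene_span   <- interval_range(tx)   # always one row per gene
gene_pieces <- interval_reduce(tx)  # possibly several rows per gene
```

With the `range`-style call each gene comes back as a single start/end pair, which is what the question asked for; the `reduce`-style result can still contain several pieces per gene.
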
>> The names() of gene.bounds are the Entrez IDs of the genes. You can use
>> the org.Hs.eg.db package to look up the gene symbols:
>>
>> R> library(org.Hs.eg.db)
>> R> symbols <- mget(names(gene.bounds), org.Hs.egSYMBOL, ifnotfound=NA)
>>
>> `symbols` will now be a list (names are Entrez IDs, values are the gene
>> symbols) that you can manipulate in "the standard R way".
>>
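
As one possible reading of "the standard R way", a minimal sketch that flattens such a list into a data frame. The list below is a hand-made stand-in for the `mget()` result (the "0000" ID is a placeholder for an ID that was not found), so it runs without any annotation package; `gene.table` is a name chosen here for illustration.

```r
# Stand-in for the mget() result: names are Entrez IDs, values are symbols,
# and IDs that were not found are NA (as with ifnotfound=NA).
symbols <- list("1" = "A1BG", "2" = "A2M", "0000" = NA)

# Take the first entry per ID and coerce to character so NA entries fit too.
sym <- vapply(symbols, function(x) as.character(x[1]), character(1))

# One row per Entrez ID.
gene.table <- data.frame(entrez.id = names(sym),
                         symbol    = unname(sym),
                         stringsAsFactors = FALSE)

# Drop IDs without a symbol.
gene.table <- gene.table[!is.na(gene.table$symbol), ]
```

The resulting table can then be joined to the gene coordinates by Entrez ID, for example with `merge()`.
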
>> Hope that helps,
>>
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>



-- 
Xiaoling (Shirley) Zhang

M.D., Ph.D. (Bioinformatics)
Boston University, Boston, MA
Tel: (857) 233-9862
Email: zhangxl at bu.edu


