[BioC] more stupid *Ranges questions...

Sean Davis sdavis2 at mail.nih.gov
Tue Sep 20 12:31:01 CEST 2011


Hi, Tim.

If you really need Entrez Gene annotation, I'd suggest using a
transcript db derived from RefSeq rather than from UCSC Known Genes.
Otherwise, the workflow will be the same, I believe.
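
For reference, here is a minimal sketch of the filtering step asked about
below, using toy data rather than a real transcript db. The objects tx and
cpgsites and all coordinates are made up for illustration, and mcols() is
the current accessor for what the thread calls elementMetadata(). With a
real db you would, as far as I know, first pull out a GRanges with
transcripts(txdb, columns = c("tx_name", "gene_id")) and subset that, since
the db object itself cannot be indexed with [. In that case gene_id comes
back as a list-like column, so the is.na() test would have to become a test
for zero-length elements.

```r
library(GenomicRanges)

# Toy stand-ins for the real objects (hypothetical data, not from the thread).
tx <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 500, 900), width = 50),
  strand   = c("+", "-", "+"),
  tx_name  = c("txA", "txB", "txC"),
  gene_id  = c("1", NA, "3")      # txB has no Entrez Gene ID
)
cpgsites <- GRanges("chr1", IRanges(start = c(480, 880), width = 1))

# Keep only the transcripts annotated with an Entrez Gene ID.
annotated <- tx[!is.na(mcols(tx)$gene_id)]

# Collapse each transcript to its TSS (a strand-aware 1-bp range).
tss <- resize(annotated, width = 1, fix = "start")

# nearest() returns, for each site, an index into `tss`.
hits <- nearest(cpgsites, tss)
nearest_gene <- mcols(tss)$gene_id[hits]
```

The unannotated transcript txB sits closest to the first site, but it is
dropped before the search, so that site is assigned to txA instead.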

Sean


2011/9/20 Tim Triche, Jr. <tim.triche at gmail.com>:
> Hi Herve (and thank you),
>
> Is there an idiomatic approach that will get me the nearest annotated TSS
> having an Entrez gene_id?   Something along the lines of
>
> nearest( cpgsites, txdb[ which(!is.na(elementMetadata(txdb)$gene_id)) ] )
>
> Something like that, but which gives me the desired subset of transcripts
> (right now I can't get it).
>
> I guess the deal with non-coding RNAs is that I should just use the closest
> transcript (period) but this (walk-upstream-and-get-the-nearest-EG-ID) seems
> like the sort of problem that you or someone else must have solved years
> ago.  I'd love to take advantage of that if it's the case.
>
> Thanks yet again,
>
> --t
>
>
> 2011/9/19 Hervé Pagès <hpages at fhcrc.org>
>
>> Hi Tim,
>>
>> See this email from the UCSC people for why not all UCSC Genes IDs
>> are mapped to an Entrez Gene ID:
>>
>>  https://lists.soe.ucsc.edu/pipermail/genome/2011-April/025784.html
>>
>> HTH,
>> H.
>>
>>
>>
>> On 11-09-18 10:52 AM, Tim Triche, Jr. wrote:
>>
>>> Yeah, I realized that on Friday and forgot to post it. One question,
>>> though -- some of the transcripts aren't annotated to a gene (I would also
>>> be happy with putative or confirmed ncRNAs, miRNAs, etc, even more so if I
>>> could keep them separate -- is there something like a "knownNcRna" track or
>>> table outside of UCSC that I should look into for this purpose?).
>>>
>>> Should I just throw out all the transcripts without an EntrezGene ID for
>>> the time being, then circle back and revisit this when I find the
>>> appropriate resource for non-coding but annotated transcripts?
>>>
>>> It seems odd that a table of KnownGene transcripts would lack gene IDs for
>>> some of the transcripts.
>>>
>>> Thanks again for a very useful package,
>>>
>>> --t
>>>
>>> On Sep 18, 2011, at 10:33 AM, Michael Lawrence <lawrence.michael at gene.com> wrote:
>>>
>>>
>>>>
>>>> On Fri, Sep 16, 2011 at 8:28 PM, Tim Triche, Jr.<ttriche at usc.edu>
>>>>  wrote:
>>>> OK, so I took your advice and used
>>>>
>>>> transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene::Hsapiens_UCSC_hg19_knownGene_TxDb)
>>>>
>>>> and indeed that is quite handy (got all my TSSes forward and reverse for
>>>> all my probes in seconds, yay!).  Now the question is, how do I use the
>>>> associated EntrezGene IDs? e.g. the trusty org.Hs.eg.db says...
>>>>
>>>> > elementMetadata(foo)$tx_name[1]
>>>> [1] "uc001aaa.3"
>>>> > org.Hs.egSYMBOL[[ elementMetadata(foo)$tx_name[1] ]]
>>>> NULL
>>>>
>>>> Two steps forward, one step back.... eventually I will cram all of this
>>>> into a genoset, though... :-)
>>>>
>>>>
>>>> You do not need to use the org.* packages for this. Just ask
>>>> transcripts() for the gene_id column. See ?transcripts.
>>>>
>>>> Michael
>>>>
>>>> thanks!
>>>>
>>>> --t
>>>>
>>>>
>>>>
>>>> On Thu, Sep 15, 2011 at 3:02 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:
>>>> Easiest path is to convert the RangedData to a GRanges:
>>>>
>>>> as(TSS.human.GRCh37, "GRanges")
>>>>
>>>> I might recommend though to get the TSS's from
>>>> GenomicFeatures::transcripts.
>>>>
>>>> Michael
>>>>
>>>> On Thu, Sep 15, 2011 at 2:28 PM, Tim Triche, Jr.<tim.triche at gmail.com>
>>>>  wrote:
>>>> I have a GRanges object built from interrogated sites and a RangedData
>>>> object of human (allegedly canonical) transcription start sites, from Julie
>>>> Zhu's ChIPpeakAnno package.  I want to walk up and down each chromosome and
>>>> find the nearest forward and reverse strand TSS and their distance from each
>>>> site.  This seems like it would work:
>>>>
>>>> > nearest(cpgranges, TSS.human.GRCh37)
>>>>
>>>> But one of the objects isn't the right type:
>>>>
>>>> Error in function (classes, fdef, mtable)  :
>>>>  unable to find an inherited method for function "nearest", for signature
>>>> "GRanges", "RangedData"
>>>>
>>>> What's the right way to solve this problem?  I know about follow() and
>>>> precede(), but those won't work either until I solve this :-)
>>>>
>>>> thanks!
>>>>
>>>>
>>>>
>>>> --
>>>> If people do not believe that mathematics is simple, it is only because they
>>>> do not realize how complicated life is.
>>>> John von Neumann <http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Von_Neumann.html>
>>>>
>>>>        [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> When you emerge in a few years, you can ask someone what you missed, and
>>>> you'll find it can be summed up in a few minutes.
>>>>
>>>> Derek Sivers
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>>
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fhcrc.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>>
>
>
>
>


