[BioC] How to convert from IRanges(List) to Rle(List)

Nicolas Delhomme delhomme at embl.de
Sun Apr 8 14:17:44 CEST 2012


Hi Michael,

That sounds really good!

When you talk about refactoring the transcriptLocsToRefLocs function, what do you mean exactly? I didn't find the interface hard to understand; it took me ~5 mins to figure out. Some error messages could be more explicit though, e.g. I got the following when tlocs was a list of numeric vectors instead of a list of integer vectors:

Error in .Call2("tlocs2rlocs", tlocs, exonStarts, exonEnds, strand, decreasing.rank.on.minus.strand,  : 
  'tlocs' has invalid elements

but that was all really.
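
For anyone hitting the same error, here is a minimal sketch of the workaround in base R. It assumes tlocs is a plain list of whole-number positions; the names and values below are made up for illustration and are not from my data.

```r
# Hypothetical transcript-level positions; R stores bare numbers as doubles,
# so each element here is a numeric (not integer) vector.
tlocs <- list(tx1 = c(1, 5, 12), tx2 = c(3, 7))
stopifnot(!any(vapply(tlocs, is.integer, logical(1))))

# Coerce every element to an integer vector before passing the list on.
# Note: as.integer() truncates toward zero, so this is only safe when the
# positions are already whole numbers.
tlocs_int <- lapply(tlocs, as.integer)
stopifnot(all(vapply(tlocs_int, is.integer, logical(1))))
```

Writing the positions with an L suffix in the first place (e.g. c(1L, 5L, 12L)) avoids the coercion step altogether.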

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On 8 Apr 2012, at 07:45, Michael Lawrence wrote:

> On Sat, Apr 7, 2012 at 7:31 PM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
> 
>> On 04/07/12 16:30, Michael Lawrence wrote:
>> 
>>> On Sat, Apr 7, 2012 at 11:12 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>> 
>>>> On 04/07/2012 05:39 AM, Nicolas Delhomme wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I'm just wondering if there would be a direct way to convert an
>>>>> IRanges to an Rle, as in: as(rng,"Rle"). At the moment, I can convert
>>>>> my IRanges into an integer vector and cast that as an Rle
>>>>> (Rle(as.integer(rng))), but that is not very efficient on a long
>>>>> IRangesList (with > 700,000 IRanges in it). It takes ~10 mins with an
>>>>> sapply.
>>>>> 
>>>>> Why I want that is for the following: I have an IRangesList of
>>>>> transcripts (describing exons at the genome level) and for every one,
>>>>> I have a bp position at the transcript level that I want to convert
>>>>> into a genomic bp position. Basically, I need to be able to convert a
>>>>> given transcript coordinate into the corresponding genomic
>>>>> coordinate. My IRanges contain the genomic coordinates of every
>>>>> transcript and by converting it into an integer vector, I can select
>>>>> the right genomic bp coordinate by using the transcript bp coordinate
>>>>> as an index (as.integer(rng)[transcript.pos]).
>>>>> 
>>>>> 
>>>>> I considered the IRanges approach because I keep the transcript name
>>>>> and I'm sure that I am looking up the right coord in the right
>>>>> transcript, but I'm open to other suggestions.
>>>>> 
>>>> Hi Nico -- VariantAnnotation::refLocsToLocalLocs,
>>>> GenomicFeatures::transcriptLocs2refLocs
>>>> and IRanges::map might do this for you; no direct experience on my part,
>>>> though. Martin
>>>> 
>>>> 
>>> Right. Right now, IRanges::map will take things from global to local
>>> (either into transcripts or reads, depending on the argument). This takes
>>> the place of "refLocsToLocalLocs". What "map" needs to support is the
>>> reverse. I think we could do this with a new function. I am not sure
>>> if it should be called reverseMap though, because it's not clear which is
>>> forward and which is reverse. Maybe we need mapToGlobal and mapToLocal? Or
>>> maybe "absolute" and "relative" are better terms?
>>> 
>>> Btw, we are working on an "easier to use" interface for the
>>> transcriptLocsToRefLocs function and that should be integrated with any
>>> refactoring/renaming.
>>> 
>> I like the idea of the map generic and where it is going. I think the
>> mapToGlobal and mapToLocal terms are clearer. I assume that in mapToGlobal
>> the 'from' would be along the lines of cDNA-based, cds-based, or
>> protein-based coordinates. In mapToLocal the 'from' would always be
>> genomic-based coordinates. Yes?
>> 
>> 
> Yes, that would be the typical use case, although the generic is meant to
> be more general, i.e., it is in IRanges, not GenomicRanges.
> 
> 
>> Valerie
>> 
>> 
>>> Let's get a discussion going.
>>> 
>>> Michael
>>> 
>>> 
>>>>> Thanks for any pointers,
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Nico
>>>>> 
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>> 
>>>>> 
>>>>> 
>>>> --
>>>> Computational Biology
>>>> Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>> 
>>>> Location: M1-B861
>>>> Telephone: 206 667-2793
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 


