[BioC] GRanges - reduce() function

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Fri Nov 18 15:56:26 CET 2011


On Fri, Nov 18, 2011 at 9:12 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 11/17/2011 05:57 PM, Jason Ross wrote:
>>
>> Hi Fahim,
>>
>> I am also frustrated by this. The meta-data also vanishes when using
>> findOverlaps(). I'm thinking of writing some wrapper functions to place
>> the meta-data back into the GRanges object.
>
> Hi Jason et al.,
>
> The problem in 'reduce' is that the elementMetadata columns need to be
> 'reduce'd too, and there is no universal way to do that -- for 'transcripts'
> in Fahim's example, maybe it's just collapsing entries into a CharacterList,
> whereas for "Gene" it's split-by-reduced-range and 'unique'. For numeric
> values one might sum or mean or max or ....
>
> Can you be more specific about findOverlaps? It's not really clear which
> data you'd like to have propagated.
>
> For Fahim's question, I arrived at
>
> values(r)[["Gene"]] <-
>    tapply(values(gr)[["Gene"]], match(gr, r), unique)
>
> which I think is quite robust, but I'd recommend checking carefully on
> complicated data.

When I need to disentangle a reduce() call, I always use findOverlaps(), like
  grRed = reduce(gr)
  findOverlaps(gr, grRed)
The hits tell you which reduced range each original range was merged into.
This may seem odd (it repeats part of the computation), but in my
experience it is quite fast.
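
As a rough sketch of how the hits can be used to carry a metadata column
over to the reduced ranges: the toy ranges and the "Gene" values below are
made up for illustration, and mcols() is used as the accessor for what the
thread calls values(). Collapsing with paste() is only one possible choice;
as Martin notes, there is no universal way to combine the columns. The
sketch also assumes all ranges are unstranded, so each original range hits
exactly one reduced range.

```r
library(GenomicRanges)

# Hypothetical toy data: the first two ranges overlap, the third stands alone.
gr <- GRanges(seqnames = rep("chr1", 3),
              ranges = IRanges(start = c(1, 5, 20), end = c(10, 15, 30)),
              Gene = c("A", "B", "C"))

grRed <- reduce(gr)

# Map every original range to the reduced range it was merged into.
ov <- findOverlaps(gr, grRed)

# Group the original "Gene" values by reduced range ...
genes <- split(mcols(gr)$Gene[queryHits(ov)], subjectHits(ov))

# ... and collapse each group to a single comma-separated string.
mcols(grRed)$Gene <- vapply(genes,
                            function(g) paste(unique(g), collapse = ","),
                            character(1), USE.NAMES = FALSE)
```

With mixed strands a "*" range can overlap more than one reduced range, so
the grouping step would need checking on such data.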

Kasper


