[BioC] Subscripting GenomicRanges objects with [[ or $

Patrick Aboyoun paboyoun at fhcrc.org
Wed Sep 1 18:27:13 CEST 2010


I am not sure where the design will lead, but another aspect of
GRanges is that it has an accompanying GRangesList class for housing
information such as the constituent exons of a transcript. Developers
and script writers benefit from having a similar mechanism for
extracting metadata columns from both class types. For a GRangesList,
however, the [[ and $ operators already pull out the GRanges object for
the selected element (e.g. a transcript). So even if [[ and $ methods
for column extraction were added to GRanges, the same operators could
not do that job for GRangesList objects.
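
As a rough sketch of the point about GRangesList (the object names, the
"tx1"/"tx2" grouping and the "name" column are made up for illustration,
and this assumes the GRanges()/GRangesList() constructors and the values()
accessor from GenomicRanges behave as documented):

```r
# Minimal sketch; all object and column names are illustrative only.
library(GenomicRanges)

gr <- GRanges(seqnames = c("1", "2", "3"),
              ranges   = IRanges(start = c(10, 100, 1000),
                                 end   = c(20, 200, 2000)),
              strand   = c("+", "+", "-"),
              name     = c("seq1", "seq2", "seq3"))

# Hypothetical grouping of the ranges into two "transcripts".
grl <- GRangesList(tx1 = gr[1:2], tx2 = gr[3])

# On a GRangesList, [[ selects a list element, i.e. a GRanges object ...
tx1 <- grl[["tx1"]]

# ... so a metadata column still has to be reached through values().
tx1.names <- values(tx1)[["name"]]
```

Here [[ is already taken for element selection on the list, which is why
it cannot double as a column accessor there.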


Cheers,
Patrick


Quoting Michael Lawrence <lawrence.michael at gene.com>:

> On Wed, Sep 1, 2010 at 3:07 AM, Tim Yates <tyates at picr.man.ac.uk> wrote:
>
>> Hi again,
>>
>> One of the really nice things about the RangedData object is that it could
>> be treated (in general) the same way you would treat a data.frame, so it
>> was possible to write methods that handled both object types the same way.
>>
>>
> This was one of the design goals. Unfortunately, RangedData has some strange
> behavior due to its internal structure. For example, it is not possible to
> reorder rows across spaces (chromosomes). Usually, this is not a big deal,
> but it can bite you. GRanges takes a simpler, flatter approach, but it was
> designed as a set of ranges with formal treatment of spaces, strands + extra
> information, rather than as a data frame with formal treatment of spaces and
> ranges (RangedData).
>
>> I have a method which currently accepts a data.frame or a RangedData object
>> which I want to extend to accept GRanges objects as well.
>>
>> Without the [[ or $ subscript operators being implemented would I need to
>> have a switch based on the class of the parameter?
>>
>> As the values(obj)[['field']] method only works for GRanges objects (for
>> RangedData, this method does not cause an error, it just returns NULL),
>
>
>
> Yes, there is an unfortunate conflict here. values() for RangedData returns
> the DataFrameList, so its names are the names of the chromosomes. I think
> you're better off adding a [[ method for GRanges objects, rather than a
> .get.column().
>
> Michael
>
>
>> I guess I would need to write something like this:
>>
>> .get.column = function( obj, field ) {
>>      # is() also matches subclasses; class( obj ) == ... can return a vector
>>      if( is( obj, 'GRanges' ) ) {
>>          values(obj)[[ field ]]
>>      }
>>      else {
>>          obj[[ field ]]
>>      }
>> }
>>
>> Then, call
>>
>>  .get.column(obj,'name')
>>
>> wherever I used to simply use
>>
>>  obj[['name']]
>>
>> before introducing GenomicRanges?
>>
>> Tim
>>
>> On 27/08/2010 15:02, "Martin Morgan" <mtmorgan at fhcrc.org> wrote:
>>
>> > On 08/27/2010 03:03 AM, Tim Yates wrote:
>> >> Hi Richard,
>> >>
>> >> Ahhh..cool, yeah that works. Shame it's not a unified interface across
>> >> all three datatypes though.
>> >
>> > These were intentional design decisions to reduce ambiguity about which
>> > component of these complex objects a subscript operation is meant to
>> > apply to, in the long run making it easier to write unambiguous and
>> > easy-to-read code.
>> >
>> > Martin
>> >
>> >>
>> >> Thanks for pointing me in the right direction though :-)
>> >>
>> >> Tim
>> >>
>> >> On 27/08/2010 10:31, "Richard Pearson" <richard.pearson at well.ox.ac.uk>
>> >> wrote:
>> >>
>> >>> Hi Tim
>> >>>
>> >>> I think you need the values accessor method here:
>> >>>
>> >>> print( values(my.gr)[[ 'name' ]] )
>> >>>
>> >>> Cheers
>> >>>
>> >>> Richard
>> >>>
>> >>>
>> >>> Tim Yates wrote:
>> >>>> Hi all,
>> >>>>
>> >>>> I'm trying to move to using GRanges objects for storing my genomic
>> >>>> features rather than IRanges objects that I use currently.
>> >>>>
>> >>>> However, I cannot seem to subscript the Genomic Ranges object to
>> >>>> extract a single column from the meta-data of the object.
>> >>>>
>> >>>> Hopefully this code explains what I am trying to do, and someone can
>> >>>> point me in the right direction?
>> >>>>
>> >>>> Cheers,
>> >>>>
>> >>>> Tim
>> >>>>
>> >>>>> library(GenomicRanges)
>> >>>> Loading required package: IRanges
>> >>>>
>> >>>> Attaching package: 'IRanges'
>> >>>>
>> >>>>
>> >>>>     The following object(s) are masked from package:base :
>> >>>>
>> >>>>      cbind,
>> >>>>      Map,
>> >>>>      mapply,
>> >>>>      order,
>> >>>>      paste,
>> >>>>      pmax,
>> >>>>      pmax.int,
>> >>>>      pmin,
>> >>>>      pmin.int,
>> >>>>      rbind,
>> >>>>      rep.int,
>> >>>>      table
>> >>>>
>> >>>>> library(GenomicRanges)
>> >>>>> my.starts  = c(     10,    100,   1000 )
>> >>>>> my.ends    = c(     20,    200,   2000 )
>> >>>>> my.spaces  = c(    '1',    '2',    '3' )
>> >>>>> my.strands = c(    '+',    '+',    '-' )
>> >>>>> my.names   = c( 'seq1', 'seq2', 'seq3' )
>> >>>>> my.delta   = c(   1.23,   2.34,   3.45 )
>> >>>>>
>> >>>>> my.df = data.frame( start=my.starts, end=my.ends, space=my.spaces, strand=my.strands, name=my.names, delta=my.delta )
>> >>>>> my.rd = as( my.df, 'RangedData' )
>> >>>>> my.gr = as( my.rd, 'GRanges' )
>> >>>>>
>> >>>>
>> >>>> # Extract the name field from each of these objects using [[
>> >>>>
>> >>>>> print( my.df[[ 'name' ]] )
>> >>>> [1] seq1 seq2 seq3
>> >>>> Levels: seq1 seq2 seq3
>> >>>>> print( my.rd[[ 'name' ]] )
>> >>>> [1] seq1 seq2 seq3
>> >>>> Levels: seq1 seq2 seq3
>> >>>>> print( my.gr[[ 'name' ]] )
>> >>>> Error in my.gr[["name"]] : missing '[[' method for Sequence class GRanges
>> >>>>
>> >>>> # Extract the name field from each of these objects using $
>> >>>>
>> >>>>> print( my.df$'name' )
>> >>>> [1] seq1 seq2 seq3
>> >>>> Levels: seq1 seq2 seq3
>> >>>>> print( my.rd$'name' )
>> >>>> [1] seq1 seq2 seq3
>> >>>> Levels: seq1 seq2 seq3
>> >>>>> print( my.gr$'name' )
>> >>>> Error in x[[name, exact = FALSE]] :
>> >>>>   missing '[[' method for Sequence class GRanges
>> >>>>> sessionInfo()
>> >>>> R version 2.10.1 (2009-12-14)
>> >>>> x86_64-apple-darwin9.8.0
>> >>>>
>> >>>> locale:
>> >>>> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>> >>>>
>> >>>> attached base packages:
>> >>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> >>>>
>> >>>> other attached packages:
>> >>>> [1] GenomicRanges_1.0.8 IRanges_1.6.15
>> >>>>
>> >>>> _______________________________________________
>> >>>> Bioconductor mailing list
>> >>>> Bioconductor at stat.math.ethz.ch
>> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>>> Search the archives:
>> >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>>>
>> >


