[BioC] subset GRanges object via ElementMetadata
Martin Morgan
mtmorgan at fhcrc.org
Sat Feb 23 01:27:40 CET 2013
On 02/22/2013 02:35 PM, Tim Triche, Jr. wrote:
> That's odd... I added an NA and sure enough, it fails:
>
> R> test.gr[ test.gr$over == 2 ]
> Error in IRanges:::normalizeSingleBracketSubscript(i, x) :
> subscript contains NAs
>
> But which() works fine:
>
> R> test.gr[ which(test.gr$over == 2) ]
> GRanges with 1 range and 3 metadata columns:
> seqnames ranges strand | edensity epeak over
> <Rle> <IRanges> <Rle> | <integer> <integer> <integer>
> [1] chr1 [762136, 763199] * | 1000 771 2
> ---
>
> I wonder if this is an easy fix, too?
In base R, subscripting with NA leads to

    > x = 1:5
    > x[NA]
    [1] NA NA NA NA NA

which makes a weird kind of sense (the length-1 logical NA is recycled), but
IRanges and GRanges don't support the notion of NA ranges. So the answer is
probably that this is unsupported by design rather than a bug that can be fixed.
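
For anyone following along, the behaviour discussed in this thread can be
reproduced in base R alone, without loading any Bioconductor package. This is
only a sketch: the vector "over" below is a hypothetical stand-in for
test.gr$over with one NA added, not data from the original post. It shows why
== leaves an NA in the index while which() and %in% do not.

```r
# Hypothetical stand-in for test.gr$over, with one NA added for illustration.
over <- c(1L, 2L, NA, 0L, 0L, 0L)

# A bare NA is logical, so it is recycled to the length of x.
x <- 1:5
x[NA]            # five NAs
x[NA_integer_]   # a single NA: integer subscripts are not recycled

# `==` propagates the NA into the logical index ...
idx_eq <- over == 2               # FALSE TRUE NA FALSE FALSE FALSE

# ... whereas which() keeps only the positions that are TRUE,
idx_which <- which(over == 2)     # 2

# and %in% never returns NA, because it is built on match().
idx_in <- over %in% 2             # FALSE TRUE FALSE FALSE FALSE FALSE

# Another NA-free option: make the NA handling explicit.
idx_explicit <- !is.na(over) & over == 2
```

Any of the NA-free indices can then serve as the subscript, as in
test.gr[which(test.gr$over == 2)] quoted above.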
Martin
>
>
>
>
> On Fri, Feb 22, 2013 at 2:26 PM, Arnaud Amzallag
> <arnaud.amzallag at gmail.com> wrote:
>
>> test.gr[values(test.gr)$over %in% 2]
>>
>> works.
>>
>> test.gr[values(test.gr)$over == 2] works too if over does not contain
>> NAs.
>>
>> Arnaud
>>
>> On Feb 22, 2013, at 10:33 AM, Hermann Norpois wrote:
>>
>>> Hello,
>>>
>>> I am looking for a way to subset a GRanges object by its values (or
>>> ElementMetadata column), for instance over==2.
>>>
>>> How does it work?
>>>
>>> Thanks
>>> Hermann
>>>
>>>
>>>> test.gr
>>> GRanges with 6 ranges and 3 metadata columns:
>>> seqnames ranges strand | edensity epeak over
>>> <Rle> <IRanges> <Rle> | <integer> <integer> <integer>
>>> [1] chr1 [713844, 714487] * | 1000 256 1
>>> [2] chr1 [762136, 763199] * | 1000 771 2
>>> [3] chr1 [780124, 780289] * | 519 74 0
>>> [4] chr1 [780533, 780677] * | 516 68 0
>>> [5] chr1 [781104, 781387] * | 601 140 0
>>> [6] chr1 [793830, 794396] * | 610 290 0
>>> ---
>>> seqlengths:
>>> chr1 chr10 chr11 chr12 chr13 chr14 ... chr6 chr7 chr8 chr9 chrX chrY
>>> NA NA NA NA NA NA ... NA NA NA NA NA NA
>>>> dput (test.gr)
>>> new("GRanges"
>>> , seqnames = new("Rle"
>>> , values = structure(1L, .Label = c("chr1", "chr10", "chr11", "chr12",
>>> "chr13",
>>> "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chr2",
>>> "chr20", "chr21", "chr22", "chr3", "chr4", "chr5", "chr6", "chr7",
>>> "chr8", "chr9", "chrX", "chrY"), class = "factor")
>>> , lengths = 6L
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , ranges = new("IRanges"
>>> , start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L)
>>> , width = c(644L, 1064L, 166L, 145L, 284L, 567L)
>>> , NAMES = NULL
>>> , elementType = "integer"
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , strand = new("Rle"
>>> , values = structure(3L, .Label = c("+", "-", "*"), class = "factor")
>>> , lengths = 6L
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , elementMetadata = new("DataFrame"
>>> , rownames = NULL
>>> , nrows = 6L
>>> , listData = structure(list(edensity = c(1000L, 1000L, 519L, 516L,
>>> 601L, 610L
>>> ), epeak = c(256L, 771L, 74L, 68L, 140L, 290L), over = c(1L,
>>> 2L, 0L, 0L, 0L, 0L)), .Names = c("edensity", "epeak", "over"))
>>> , elementType = "ANY"
>>> , elementMetadata = NULL
>>> , metadata = list()
>>> )
>>> , seqinfo = new("Seqinfo"
>>> , seqnames = c("chr1", "chr10", "chr11", "chr12", "chr13", "chr14",
>>> "chr15",
>>> "chr16", "chr17", "chr18", "chr19", "chr2", "chr20", "chr21",
>>> "chr22", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9",
>>> "chrX", "chrY")
>>> , seqlengths = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
>>> NA_integer_, NA_integer_, NA_integer_, NA_integer_)
>>> , is_circular = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>>> NA, NA,
>>> NA, NA, NA, NA, NA, NA, NA, NA, NA)
>>> , genome = c(NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
>>> NA_character_, NA_character_, NA_character_, NA_character_, NA_character_
>>> )
>>> )
>>> , metadata = list()
>>> )
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793