[BioC] IRanges::Rle::[ arguably should return NA when i is out of range

Hervé Pagès hpages at fhcrc.org
Mon Mar 10 21:12:16 CET 2014


Hi Malcolm,

On one hand, Rle is a Vector subclass, and subsetting an Rle is
consistent with subsetting any other Vector subclass (IRanges,
GRanges, DNAStringSet, etc.), where trying to extract an element
that doesn't exist raises an error. Of course, raising an error is
the only choice for the Vector derivatives other than Rle because
they don't support NAs.

On the other hand, an Rle is a memory-efficient replacement for an
ordinary atomic vector and, ideally, the two should be substitutable
(which means the former should mimic the latter as closely as
possible).

A choice had to be made 5 years ago when Rle's were implemented
in the IRanges package. Should we revisit that choice and support
subsetting of an Rle by an out-of-bounds subscript? Should we also
support NAs in the subscript?
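
For reference, here is a minimal sketch of the atomic-vector semantics
in question, written as a helper that sits on top of the existing
strict subsetting. The name lenient_extract is hypothetical (it is not
an IRanges function), and the sketch assumes the subscript is numeric
and holds only positive positions or NAs:

```r
# Hypothetical helper (not an IRanges function): extract x[i], yielding NA
# for NA or out-of-bounds positions instead of raising an error.
# Assumes i is a numeric subscript holding positive positions or NAs.
lenient_extract <- function(x, i) {
    stopifnot(is.numeric(i))
    ok <- !is.na(i) & i >= 1 & i <= length(x)
    # pos maps each requested position to its slot among the in-range
    # values; slots for rejected positions stay NA.
    pos <- rep(NA_integer_, length(i))
    pos[ok] <- seq_len(sum(ok))
    vals <- as.vector(x[i[ok]])  # only in-range positions reach `[`
    vals[pos]                    # an NA index yields NA of the same type
}

v <- 1:10
identical(lenient_extract(v, 10:13), v[10:13])                # TRUE
identical(lenient_extract(v, c(2, NA, 11)), v[c(2, NA, 11)])  # TRUE
```

With IRanges loaded, lenient_extract(Rle(1:10), 10:13) should give the
same ordinary vector, assuming as.vector() decodes the Rle; the result
could be wrapped in Rle() to keep it compressed.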

Thanks,
H.


On 03/10/2014 11:52 AM, Cook, Malcolm wrote:
> I think the semantics of indexing into an Rle should be the same as indexing into the corresponding vector.
>
> But it is not in the case when indices are out of bounds.
>
> Example:
>
> library(IRanges)
>> c(1:10)[10:13]
> [1] 10 NA NA NA
>
>> Rle(c(1:10))[10:13]
> Error in normalizeSingleBracketSubscript(i, x) :
>    subscript contains NAs or out of bounds indices
>
>
> I've made similar arguments before which were taken up, as in https://stat.ethz.ch/pipermail/bioconductor/2013-September/054820.html
>
> Is there a good reason NOT to change these semantics?
>
> In the meantime, any suggested clever workarounds to get such extraction semantics?
>
> Thanks!
>
> ~ Malcolm Cook
> Computational Biology / Shilatifard Lab - Stowers Institute for Medical Research - Kansas City
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


