[Rd] Subsetting the "ROW"s of an object

Berry, Charles ccberry using ucsd.edu
Fri Jun 8 20:38:42 CEST 2018



> On Jun 8, 2018, at 10:37 AM, Hervé Pagès <hpages using fredhutch.org> wrote:
> 
> Also the TRUEs cause problems if some dimensions are 0:
> 
>  > matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]
>  Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
>    (subscript) logical subscript too long

OK. But this is easy enough to handle. 
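
One possible way to handle it, sketched under assumptions and not necessarily the fix intended here: give each trailing dimension an explicit seq_len() index instead of TRUE, so a zero-extent dimension gets integer(0) rather than a logical subscript that is too long. The name subset_ROW0 is hypothetical, chosen only for this illustration.

```r
## Hypothetical helper (not from the thread): index every trailing
## dimension with seq_len() so zero-extent dimensions are handled.
subset_ROW0 <- function(x, i) {
    d <- dim(x)
    ## one full index vector per dimension after the first;
    ## seq_len(0) is integer(0), which is a valid (empty) subscript
    idx <- lapply(d[-1L], seq_len)
    do.call(`[`, c(list(x, i), idx, list(drop = FALSE)))
}

m <- matrix(raw(0), nrow = 5, ncol = 0)
dim(subset_ROW0(m, 1:3))
```

For a plain vector dim(x) is NULL, so idx is an empty list and the call reduces to x[i, drop = FALSE].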

> 
> H.
> 
> On 06/08/2018 10:29 AM, Hadley Wickham wrote:
>> I suspect this will have suboptimal performance since the TRUEs will
>> get recycled. (Maybe there is, or could be, ALTREP support for
>> recycling.)
>> Hadley


AFAICS, it is not an issue. Taking

arr <- array(rnorm(2^22),c(2^10,4,4,4))

as a test case 

and using a function that evaluates either the literal call `x[i,,,,drop=FALSE]' or the constructed call via `eval(mc)':

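subset_ROW4 <-
     function(x, i, useLiteral=FALSE)
{
    ## hard-coded call for a 4-d array, used as the timing baseline
    literal <- quote(x[i,,,,drop=FALSE])
    ## start from x[i] and append one TRUE per remaining dimension
    mc <- quote(x[i])
    nd <- max(1L, length(dim(x)))
    mc[seq(4, length.out=nd-1L)] <- rep(TRUE, nd-1L)
    mc[["drop"]] <- FALSE
    ## both calls are evaluated here, in the function's own frame
    if (useLiteral)
        eval(literal)
    else
        eval(mc)
 }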

I get identical times with

system.time(for (i in 1:10000) subset_ROW4(arr,seq(1,length=10,by=100),TRUE))

and with 

system.time(for (i in 1:10000) subset_ROW4(arr,seq(1,length=10,by=100),FALSE))

Changing the dimensions to c(2^5, 2^7, 4, 4) and running something similar also shows equal times.
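
To make the comparison concrete, here is a small sketch of what the constructed call looks like and a check that it returns the same result as the literal form. The helper name build_row_call and the small test array are assumptions made for this illustration; they are not part of the timings above.

```r
## Hypothetical helper: build the call x[i, TRUE, ..., drop = FALSE]
## for an object with nd dimensions, without evaluating it.
build_row_call <- function(nd) {
    mc <- quote(x[i])
    mc[seq(4, length.out = nd - 1L)] <- rep(list(TRUE), nd - 1L)
    mc[["drop"]] <- FALSE
    mc
}

print(build_row_call(4L))
## x[i, TRUE, TRUE, TRUE, drop = FALSE]

## the constructed and literal calls select the same elements
x <- array(seq_len(2 * 3 * 4 * 5), c(2, 3, 4, 5))
i <- 2L
identical(eval(build_row_call(4L)), x[i, , , , drop = FALSE])
```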

Chuck

>> On Fri, Jun 8, 2018 at 10:16 AM, Berry, Charles <ccberry using ucsd.edu> wrote:
>>> 
>>> 
>>>> On Jun 8, 2018, at 8:45 AM, Hadley Wickham <h.wickham using gmail.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> Is there a better way to subset the ROWs (in the sense of NROW) of
>>>> a vector, matrix, data frame or array than this?
>>> 
>>> 
>>> You can use TRUE to fill the subscripts for dimensions 2:nd
>>> 
>>>> 
>>>> subset_ROW <- function(x, i) {
>>>>  nd <- length(dim(x))
>>>>  if (nd <= 1L) {
>>>>    x[i]
>>>>  } else {
>>>>    dims <- rep(list(quote(expr = )), nd - 1L)
>>>>    do.call(`[`, c(list(quote(x), quote(i)), dims, list(drop = FALSE)))
>>>>  }
>>>> }
>>> 
>>> 
>>> subset_ROW <-
>>>     function(x,i)
>>> {
>>>     mc <- quote(x[i])
>>>     nd <- max(1L, length(dim(x)))
>>>     mc[seq(4, length=nd-1L)] <- rep(list(TRUE), nd - 1L)
>>>     mc[["drop"]] <- FALSE
>>>     eval(mc)
>>> 
>>> }
>>> 
>>>> 
>>>> subset_ROW(1:10, 4:6)
>>>> #> [1] 4 5 6
>>>> 
>>>> str(subset_ROW(array(1:10, c(10)), 2:4))
>>>> #>  int [1:3(1d)] 2 3 4
>>>> str(subset_ROW(array(1:10, c(10, 1)), 2:4))
>>>> #>  int [1:3, 1] 2 3 4
>>>> str(subset_ROW(array(1:10, c(5, 2)), 2:4))
>>>> #>  int [1:3, 1:2] 2 3 4 7 8 9
>>>> str(subset_ROW(array(1:10, c(10, 1, 1)), 2:4))
>>>> #>  int [1:3, 1, 1] 2 3 4
>>>> 
>>>> subset_ROW(data.frame(x = 1:10, y = 10:1), 2:4)
>>>> #>   x y
>>>> #> 2 2 9
>>>> #> 3 3 8
>>>> #> 4 4 7
>>>> 
>>> 
>>> HTH,
>>> 
>>> Chuck
>>> 
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages using fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319



More information about the R-devel mailing list