[R] Expanding upon expand.grid()

Tony Plate tplate at blackmesacapital.com
Thu May 8 05:57:04 CEST 2003


If you really need speed then it might make sense to think about what the 
algorithm is computing (or consider a different language).

If what you want is a vector containing the sum of every row in the output 
of expand.grid, then you could get it more efficiently by a recursive 
algorithm that computes just those sums, e.g.,

 > f <- function(x) {if (length(x)==1) return(x[[1]]) else {r <- f(x[-1]); 
return(rep(r,length(x[[1]]))+rep(x[[1]],each=length(r)))}}
 > f(rep(list(c(-1, 1)), 4))
  [1] -4 -2 -2  0 -2  0  0  2 -2  0  0  2  0  2  2  4
 > apply(expand.grid(rep(list(c(-1, 1)), 4)), 1, sum)
  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
-4 -2 -2  0 -2  0  0  2 -2  0  0  2  0  2  2  4
 >

And for a larger input list:
 > tt <- proc.time()[1]
 > x <- f(rep(list(c(-1, 1)), 24))
 > length(x)
[1] 16777216
 > proc.time()[1]-tt
[1] 1.84
 >

The size of the output is still exponential in the length of the input 
list, so you still can't do very long lists, but it's much more efficient 
than using expand.grid (an example 1/64 of the size takes 5 times longer):

 > tt <- proc.time()[1]
 > x <- apply(expand.grid(rep(list(c(-1, 1)), 18)), 1, sum)
 > length(x)
[1] 262144
 > proc.time()[1]-tt
[1] 9.45
 >

If you only need some statistics on the vector of sums, you might be able 
to think of a clever way of getting those.  The mean is easy :-)

To answer your direct question, yes it is possible to calculate the rows 
one at a time (in R or in any other general purpose programming 
language).  That would be like a variable-size-dial counter (in which each 
dial can have a different number of clicks).  R is probably not the best 
language for a fast implementation of such a thing.

hope this helps,

Tony Plate

At Thursday 09:38 AM 5/8/2003 +0700, Andrew Criswell wrote:
>Hello All:
>
>The function expand.grid() does nearly exactly what I want for
>permutation tests I wish to carry out, and it does so quickly when the
>number is kept small as in the example below:
>
>expand.grid(rep(list(c(-1, 1)), 3))
>
>   Var1 Var2 Var3
>1   -1   -1   -1
>2    1   -1   -1
>3   -1    1   -1
>4    1    1   -1
>5   -1   -1    1
>6    1   -1    1
>7   -1    1    1
>8    1    1    1
>
>Understandably, for a larger number--16, for example--the grid expands
>to a 65,536 x 16 = 1,048,576 element array of numbers.
>
>My question is this: is it possible to compute a grid like the one above
>on a row-by-row basis?  What I would like to do is to iteratively take
>one row at a time, compute its sum, store the computed sum for the row,
>and replace the latest row by the next row. In that way, memory burden
>and computation time would be reduced.
>
>Thanks for your help,
>ANDREW
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help




More information about the R-help mailing list