[R] which is the fastest way to make data.frame out of a three-dimensional array?

Bert Gunter gunter.berton at gene.com
Sat Feb 25 20:09:28 CET 2012


Petr:

Your expand.grid solution is clearly much better than my nonsense. It
is just as fast (or faster) and is the far more sensible thing to do.

For an array, ar, with dim(ar) = c(100,100,1000), modifying your call
slightly to:

data.frame(c(ar),do.call(expand.grid,lapply(dim(ar),seq_len)))

I got:
   user  system elapsed
   1.93    0.43    2.38

Using my call I got:
   user  system elapsed
   2.23    0.44    2.70
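
For anyone who wants to reproduce this, here is a minimal sketch of the
expand.grid approach wrapped in a helper. The name array_to_df and the
column names dat, dim1, dim2, ... are my own choices for illustration,
not something from the thread. It relies on c(ar) unrolling the array
with the first index varying fastest, which is the same order in which
expand.grid varies its first argument. The array here is small (Hans's
30 x 34 x 12), so the timing will not match the numbers above.

```r
# Hypothetical helper (name and column names are assumptions):
# flatten an array of any rank into a long data.frame with one
# index column per dimension.
array_to_df <- function(ar) {
  # One integer sequence per dimension; expand.grid varies the first
  # factor fastest, matching R's column-major storage of arrays.
  idx <- do.call(expand.grid, lapply(dim(ar), seq_len))
  names(idx) <- paste0("dim", seq_along(dim(ar)))
  data.frame(dat = c(ar), idx)
}

set.seed(1)
ar <- array(rnorm(30 * 34 * 12), dim = c(30, 34, 12))
ar[2, 3, 4] <- NA  # NA cells are kept as rows, not dropped

df <- array_to_df(ar)

# Sanity check: every row's indices point back at the cell it came from.
i <- cbind(df$dim1, df$dim2, df$dim3)
stopifnot(identical(df$dat, ar[i]))

# Timing for this small array; substitute a larger one to compare.
system.time(array_to_df(ar))
```

The matrix-index check (ar[i]) is only there to confirm the ordering;
it can be dropped once you trust the result.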

Thanks for the help.

-- Bert

On Sat, Feb 25, 2012 at 9:55 AM, Petr Savicky <savicky at cs.cas.cz> wrote:
> On Sat, Feb 25, 2012 at 04:54:30PM +0100, Hans Ekbrand wrote:
>> foo <- rnorm(30*34*12)
>> dim(foo) <- c(30, 34, 12)
>>
>> I want to make a data.frame out of this three-dimensional array. Each dimension will be a variable (column) in the data.frame.
>
> Hi.
>
> Try this
>
>  n1 <- dim(foo)[1]
>  n2 <- dim(foo)[2]
>  n3 <- dim(foo)[3]
>  df <- cbind(dat=c(foo), expand.grid(dim1=1:n1, dim2=1:n2, dim3=1:n3))
>  df[1:5, ]
>
>           dat dim1 dim2 dim3
>  1 -0.5765847    1    1    1
>  2  0.4490040    2    1    1
>  3  0.2626855    3    1    1
>  4  0.2206713    4    1    1
>  5  0.9079324    5    1    1
>  ...
>
> In contrast to a previous suggestion that used foo==foo, this
> also works in the presence of NA.
>
> Hope this helps.
>
> Petr Savicky.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


