[R] Quicker way of combining vectors into a data.frame

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Nov 30 18:00:07 CET 2006


Hi,

In a function, I compute 10 (un-named) vectors of reasonable length
(4471 in the particular example I have to hand) that I want to combine
into a data frame object, that the function will return.

This step is very slow, so *I* must be doing something wrong, but I'm
not sure what the best way to make it quick and efficient would be.

I know it is the combining into data frame bit that is slow, because
I've Rprof'ed it:

$by.self
                        self.time self.pct total.time total.pct
"names<-.default"           16.58     52.8      16.58      52.8
"unlist"                     7.22     23.0       7.26      23.1
"data.frame"                 1.72      5.5      29.38      93.6
"duplicated.default"         1.66      5.3       1.66       5.3
"+"                          1.20      3.8       1.20       3.8
"list"                       0.40      1.3       0.40       1.3
"as.data.frame.numeric"      0.28      0.9       3.32      10.6
"apply"                      0.26      0.8       1.70       5.4
"pmatch"                     0.22      0.7       0.22       0.7
"paste"                      0.20      0.6       0.90       2.9
"deparse"                    0.14      0.4       0.70       2.2
"eval"                       0.12      0.4      31.28      99.7
"names<-"                    0.12      0.4      16.70      53.2
"FUN"                        0.12      0.4       1.32       4.2
"names"                      0.12      0.4       0.14       0.4
"as.list.default"            0.12      0.4       0.12       0.4
"duplicated"                 0.10      0.3       1.76       5.6
"gc"                         0.10      0.3       0.10       0.3

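For anyone wanting to reproduce this kind of output, a profile like the
one above can be collected roughly as follows. This is only a sketch:
build_df() and the profile file are placeholders, not the real function.

```r
# Placeholder workload standing in for the real function (hypothetical).
build_df <- function(n = 4471) {
  x <- runif(n)
  y <- runif(n)
  data.frame(x = x, y = y)
}

prof_file <- tempfile(fileext = ".out")   # placeholder profile file

Rprof(prof_file)                          # start the sampling profiler
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 1) {
  fab <- build_df()                       # repeat for about 1 second to collect samples
}
Rprof(NULL)                               # stop profiling

prof <- summaryRprof(prof_file)           # list with by.self, by.total, ...
print(head(prof$by.self))                 # time spent within each function itself
```

The by.self table ranks functions by the time spent in their own code,
which is the view shown above.
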
I also stepped through the function under debug(): all the calculations
before this point are quick, and then this call takes a little over 20
seconds to complete:

 fab <- data.frame(lc.ratio = lc.ratio, Q = Q,
                   fNupt = fNupt,
                   rho.n = rho.n, rho.s = rho.s,
                   net.Nimm = net.Nimm,
                   net.Nden = net.Nden,
                   CLminN = CLminN,
                   CLmaxN = CLmaxN,
                   CLmaxS = CLmaxS)

I can get it down to about 5 seconds if I drop the argument names (not
Rprof'ed):

 fab <- data.frame(lc.ratio, Q,
                   fNupt,
                   rho.n, rho.s,
                   net.Nimm,
                   net.Nden,
                   CLminN,
                   CLmaxN,
                   CLmaxS)
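
The two calls can be timed side by side with system.time(). The sketch
below uses made-up random vectors of the same length (4471) in place of
the real ones, so it may well not reproduce the slowdown; it only shows
how the comparison could be set up.

```r
set.seed(1)
n <- 4471  # length from the example; the values themselves are made up

# Hypothetical stand-ins for the ten computed vectors.
lc.ratio <- runif(n); Q <- runif(n); fNupt <- runif(n)
rho.n <- runif(n); rho.s <- runif(n)
net.Nimm <- runif(n); net.Nden <- runif(n)
CLminN <- runif(n); CLmaxN <- runif(n); CLmaxS <- runif(n)

# Call with explicit argument names.
t_named <- system.time(
  fab1 <- data.frame(lc.ratio = lc.ratio, Q = Q, fNupt = fNupt,
                     rho.n = rho.n, rho.s = rho.s,
                     net.Nimm = net.Nimm, net.Nden = net.Nden,
                     CLminN = CLminN, CLmaxN = CLmaxN, CLmaxS = CLmaxS)
)

# Call without argument names; column names are taken from the expressions.
t_bare <- system.time(
  fab2 <- data.frame(lc.ratio, Q, fNupt, rho.n, rho.s,
                     net.Nimm, net.Nden, CLminN, CLmaxN, CLmaxS)
)

print(t_named)
print(t_bare)
```

If the timings here are both small, the slowdown presumably depends on
something about the real vectors that the random stand-ins lack.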

But this still seems quite a long time, so I'm thinking that there must
be a quicker way of doing what I want (ending up with a data.frame with
the 10 vectors in it).
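
Two candidates that could be timed against the calls above are
converting a named list with as.data.frame(), and setting the data frame
attributes on the list by hand. This is a sketch with three hypothetical
stand-in vectors; whether either is actually faster in this case is
untested.

```r
n <- 4471
# Hypothetical stand-ins for three of the ten vectors.
cols <- list(lc.ratio = runif(n), Q = runif(n), fNupt = runif(n))

# Option A: convert the named list directly.
fab_a <- as.data.frame(cols)

# Option B: set the data.frame attributes by hand. This skips the checks that
# data.frame() performs (equal lengths, valid and unique names), so it is only
# safe when those are already guaranteed.
fab_b <- cols
attr(fab_b, "row.names") <- seq_len(n)
class(fab_b) <- "data.frame"

stopifnot(is.data.frame(fab_a), is.data.frame(fab_b),
          identical(dim(fab_a), dim(fab_b)))
```

Option B trades safety for speed, so it only makes sense inside a
function that already controls the lengths and names of its vectors.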

Can anyone enlighten me?

> version
               _                                          
platform       i686-pc-linux-gnu                          
arch           i686                                       
os             linux-gnu                                  
system         i686, linux-gnu                            
status         Patched                                    
major          2                                          
minor          4.0                                        
year           2006                                       
month          10                                         
day            03                                         
svn rev        39576                                      
language       R                                          
version.string R version 2.4.0 Patched (2006-10-03 r39576)

> sessionInfo()
R version 2.4.0 Patched (2006-10-03 r39576) 
i686-pc-linux-gnu 

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

Thanks in advance,

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


