[R] Quicker way of combining vectors into a data.frame

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Dec 1 13:06:37 CET 2006


On Fri, 2006-12-01 at 12:13 +0100, Peter Dalgaard wrote:
> Gavin Simpson wrote:
<snip />
> >
> > I just don't understand what is going on with data.frame.
> >
> >   
> I think there is something about the data you're not telling us...

Yes, that I was doing something very, very silly that I thought would
work (produce a vector CLmaxN of the required length), but was in fact
blowing out to a huge named list. It was this that was causing the
massive increase in computation time in data.frame over cbind.
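
A minimal sketch of how this kind of mistake can play out, using made-up data rather than the objects from the session above. The names `CLmaxN_vec` and `CLmaxN_list` are hypothetical stand-ins, and the way the list is built here is an assumption for illustration only, not the actual error:

```r
n <- 100

# What was intended: a plain numeric vector of length n
CLmaxN_vec <- runif(n)

# Hypothetical mistake: the same values held as a named list of length n
CLmaxN_list <- as.list(CLmaxN_vec)
names(CLmaxN_list) <- paste0("site", seq_len(n))

# A vector contributes a single column with n rows
df_vec <- data.frame(CLmaxN = CLmaxN_vec)
dim(df_vec)

# A list is expanded element by element, so each of the n elements
# becomes a column of its own
df_list <- data.frame(CLmaxN = CLmaxN_list)
dim(df_list)

# A cheap sanity check before building the data frame
is.atomic(CLmaxN_vec)
is.list(CLmaxN_list)
```

Checking `is.list()` or `str()` on each argument before calling data.frame is a quick way to catch an object that is not the plain vector it was meant to be.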

After correcting my mistake, timings for data.frame are:

system.time(fab <- data.frame(lc.ratio, Q,
+                      fNupt,
+                      rho.n, rho.s,
+                      net.Nimm,
+                      net.Nden,
+                      CLminN,
+                      CLmaxN,
+                      CLmaxS))
[1] 0.012 0.000 0.011 0.000 0.000
Browse[1]> system.time(fab <- data.frame(lc.ratio = lc.ratio, Q = Q,
+                      fNupt = fNupt,
+                      rho.n = rho.n, rho.s = rho.s,
+                      net.Nimm = net.Nimm,
+                      net.Nden = net.Nden,
+                      CLminN = CLminN,
+                      CLmaxN = CLmaxN,
+                      CLmaxS = CLmaxS))
[1] 0.008 0.000 0.018 0.000 0.000

One vector (CLmaxS) has names for some reason. Removing them brings the
un-named data.frame version down to the timing of the named version, and
makes no difference to the named version:

Browse[1]> names(CLmaxS) <- NULL
Browse[1]> system.time(fab <- data.frame(lc.ratio, Q,
+                      fNupt,
+                      rho.n, rho.s,
+                      net.Nimm,
+                      net.Nden,
+                      CLminN,
+                      CLmaxN,
+                      CLmaxS))
[1] 0.008 0.000 0.016 0.000 0.000
Browse[1]> system.time(fab <- data.frame(lc.ratio = lc.ratio, Q = Q,
+                      fNupt = fNupt,
+                      rho.n = rho.n, rho.s = rho.s,
+                      net.Nimm = net.Nimm,
+                      net.Nden = net.Nden,
+                      CLminN = CLminN,
+                      CLmaxN = CLmaxN,
+                      CLmaxS = CLmaxS))
[1] 0.008 0.000 0.009 0.000 0.000
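
A small sketch of one visible effect of a named vector in a data.frame call, with made-up values (`x` and `y` are hypothetical, not objects from the session above). It only shows that the names are carried over as row names and how to drop them; it does not claim to explain the timing difference:

```r
x <- c(a = 1.5, b = 2.5, c = 3.5)   # made-up named vector
y <- 1:3                            # plain vector without names

# Check whether a vector carries names
is.null(names(x))
is.null(names(y))

# The names of x are picked up as the row names of the data frame
df_named <- data.frame(y, x)
rownames(df_named)

# unname() returns a copy without names, leaving x itself untouched;
# names(x) <- NULL would strip them in place instead
x_plain <- unname(x)
df_plain <- data.frame(y, x = x_plain)
rownames(df_plain)
```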

Apologies to the list for bothering you all with my stupidity, and thank
you again to everyone who replied. I knew it was I who was doing
something wrong but couldn't see it; thanks to your comments,
suggestions and queries I was able to work out what it was.

All the best,

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



