[R] "Best" way to merge 300+ .5MB dataframes?

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Sun Aug 10 23:22:06 CEST 2014


Just load the data frames into a list and give that list to rbind (via do.call, since rbind takes the data frames as separate arguments). It is far more efficient to know at the outset how big the final data frame has to be and allocate the result memory once than to incrementally allocate larger and larger data frames along the way using Reduce.
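
A minimal sketch of that approach, assuming the files are CSVs with identical columns. The directory, file names, and column names here are made up for illustration; the first few lines only create sample files so the example is self-contained.

```r
# Hypothetical sample data: three small CSV files with the same columns.
data_dir <- file.path(tempdir(), "merge_demo")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = c(10, 20) * i),
            file.path(data_dir, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

# 1. List the data files in the directory.
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# 2. Load each file into a data frame, collecting them in a list.
frames <- lapply(files, read.csv)

# 3. Hand the whole list to rbind in a single call.
combined <- do.call(rbind, frames)
```

Note that rbind stacks rows, whereas merge performs a database-style join, so the two are not interchangeable in general; for files that share the same columns and should simply be concatenated, stacking is probably what is wanted.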
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On August 10, 2014 11:51:22 AM PDT, Grant Rettke <gcr at wisdomandwonder.com> wrote:
>Good afternoon,
>
>Today I was working on a practice problem. It was simple, and perhaps
>even realistic. It looked like this:
>
>• Get a list of all the data files in a directory
>• Load each file into a dataframe
>• Merge them into a single data frame
>
>Because all of the columns were the same, the simplest solution in my
>mind was to `Reduce' the vector of dataframes with a call to
>`merge'. That worked fine, I got what was expected. That is key
>actually. It is literally a one-liner, and there will never be index
>or scoping errors with it.
>
>Now with that in mind, what is the idiomatic way? Do people usually do
>something else because it is /faster/ (by some definition)?
>
>Kind regards,
>
>Grant Rettke | ACM, ASA, FSF, IEEE, SIAM
>gcr at wisdomandwonder.com | http://www.wisdomandwonder.com/
>“Wisdom begins in wonder.” --Socrates
>((λ (x) (x x)) (λ (x) (x x)))
>“Life has become immeasurably better since I have been forced to stop
>taking it seriously.” --Thompson
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.


