[R] Convert list of data frames to one data frame

David Winsemius dw|n@em|u@ @end|ng |rom comc@@t@net
Fri Jun 29 21:49:15 CEST 2018


> On Jun 29, 2018, at 7:28 AM, Sarah Goslee <sarah.goslee using gmail.com> wrote:
> 
> Hi,
> 
> It isn't super clear to me what you're after.

Agree.

Had a different read of the request. I thought the request was for a first step that "harmonized" the names of the columns, followed by `dplyr::bind_rows`:

library(dplyr)
newList <- lapply(employees4List, 'names<-', names(employees4List[[1]]))
bind_rows(newList)

#---------

   first1 second1
1      Al   Jones
2     Al2   Jones
3    Barb   Smith
4     Al3   Jones
5 Barbara   Smith
6   Carol   Adams
7      Al  Jones2

You might want to wrap suppressWarnings() around the bind_rows() call, since it produced many warnings about incongruent factor levels (the lapply() step itself does not warn).
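
If the warnings are a nuisance, another option is to convert the factor columns to character before binding, so there are no factor levels left to reconcile. What follows is only a sketch and has not been run against your real data. `toChar` is a helper name made up for this illustration, and the binding uses base `rbind` so the example runs without dplyr; `bind_rows(newList)` should return the same rows.

```r
# Same data as employees4List in the quoted message.
employees4List <- list(
  data.frame(first1 = "Al", second1 = "Jones"),
  data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith")),
  data.frame(first3 = c("Al3", "Barbara", "Carol"),
             second3 = c("Jones", "Smith", "Adams")),
  data.frame(first4 = "Al", second4 = "Jones2")
)

# Hypothetical helper: turn every column into character, keeping the
# data frame structure (df[] <- ... assigns column by column).
toChar <- function(df) {
  df[] <- lapply(df, as.character)
  df
}

# Harmonize: give every data frame the column names of the first one.
# setNames(x, nm) does the same job as 'names<-'(x, nm).
target  <- names(employees4List[[1]])
newList <- lapply(lapply(employees4List, toChar), setNames, nm = target)

# Base-R row bind of the harmonized, all-character data frames.
combined <- do.call(rbind, newList)
rownames(combined) <- NULL
combined
```

Because the columns are matched after renaming, this still combines by position rather than by the original names.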

-- 
David.
> Is this what you intend?
> 
>> dfbycol(employees4BList)
>  first1 last1 first2 last2 first3 last3
> 1     Al Jones   <NA>  <NA>   <NA>  <NA>
> 2     Al Jones   Barb Smith   <NA>  <NA>
> 3     Al Jones   Barb Smith  Carol Adams
> 4     Al Jones   <NA>  <NA>   <NA>  <NA>
>> 
>> dfbycol(employees4List)
>  first1  last1  first2 last2 first3 last3
> 1     Al  Jones    <NA>  <NA>   <NA>  <NA>
> 2    Al2  Jones    Barb Smith   <NA>  <NA>
> 3    Al3  Jones Barbara Smith  Carol Adams
> 4     Al Jones2    <NA>  <NA>   <NA>  <NA>
> 
> 
> If so:
> 
> employees4BList = list(
> data.frame(first1 = "Al", second1 = "Jones"),
> data.frame(first1 = c("Al", "Barb"), second1 = c("Jones", "Smith")),
> data.frame(first1 = c("Al", "Barb", "Carol"), second1 = c("Jones",
> "Smith", "Adams")),
> data.frame(first1 = ("Al"), second1 = "Jones"))
> 
> employees4List = list(
> data.frame(first1 = ("Al"), second1 = "Jones"),
> data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith")),
> data.frame(first3 = c("Al3", "Barbara", "Carol"), second3 = c("Jones",
> "Smith", "Adams")),
> data.frame(first4 = ("Al"), second4 = "Jones2"))
> 
> ###
> 
> dfbycol <- function(x) {
>  x <- lapply(x, function(y)as.vector(t(as.matrix(y))))
>  x <- lapply(x, function(y){length(y) <- max(sapply(x, length)); y})
>  x <- do.call(rbind, x)
>  x <- data.frame(x, stringsAsFactors=FALSE)
>  colnames(x) <- paste0(c("first", "last"), rep(seq(1, ncol(x)/2), each=2))
>  x
> }
> 
> ###
> 
> dfbycol(employees4BList)
> 
> dfbycol(employees4List)
> 
> On Fri, Jun 29, 2018 at 2:36 AM, Ira Sharenow via R-help
> <r-help using r-project.org> wrote:
>> I have a list of data frames which I would like to combine into one data
>> frame doing something like rbind. I wish to combine in column order and
>> not by names. However, there are issues.
>> 
>> The number of columns is not the same for each data frame. This is an
>> intermediate step to a problem and the number of columns could be
>> 2,4,6,8,or10. There might be a few thousand data frames. Another problem
>> is that the names of the columns produced by the first step are garbage.
>> 
>> Below is a method that I obtained by asking a question on stack
>> overflow. Unfortunately, my example was not general enough. The code
>> below works for the simple case where the names of the people are
>> consistent. It does not work when the names are realistically not the same.
>> 
>> https://stackoverflow.com/questions/50807970/converting-a-list-of-data-frames-not-a-simple-rbind-second-row-to-new-columns/50809432#50809432
>> 
>> 
>> Please note that the lapply step sets things up except for the column
>> name issue. If I could figure out a way to change the column names, then
>> the bind_rows step will, I believe, work.
>> 
>> So I really have two questions. How to change all column names of all
>> the data frames and then how to solve the original problem.
>> 
>> # The non general case works fine. It produces one data frame and I can
>> # then change the column names to
>> # c("first1", "last1", "first2", "last2", "first3", "last3")
>> 
>> #Non general easy case
>> 
>> employees4BList = list(data.frame(first1 = "Al", second1 = "Jones"),
>> 
>> data.frame(first1 = c("Al", "Barb"), second1 = c("Jones", "Smith")),
>> 
>> data.frame(first1 = c("Al", "Barb", "Carol"), second1 = c("Jones",
>> "Smith", "Adams")),
>> 
>> data.frame(first1 = ("Al"), second1 = "Jones"))
>> 
>> employees4BList
>> 
>> bind_rows(lapply(employees4BList, function(x) rbind.data.frame(c(t(x)))))
>> 
>> # This produces a nice list of data frames, except for the names
>> 
>> lapply(employees4BList, function(x) rbind.data.frame(c(t(x))))
>> 
>> # This list is a disaster. I am looking for a solution that works in
>> # this case.
>> 
>> employees4List = list(data.frame(first1 = ("Al"), second1 = "Jones"),
>> 
>> data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith")),
>> 
>> data.frame(first3 = c("Al3", "Barbara", "Carol"), second3 = c("Jones",
>> "Smith", "Adams")),
>> 
>> data.frame(first4 = ("Al"), second4 = "Jones2"))
>> 
>>  bind_rows(lapply(employees4List, function(x) rbind.data.frame(c(t(x)))))
>> 
>> Thanks.
>> 
>> Ira
>> 
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law



