[R] rbind of multiple data frames by column name, when each data frame can contain different columns

Rui Barradas
Thu Jun 2 09:46:02 CEST 2022


Hello,

Here is also a base-R-only version.
The row-binding code is taken from StackOverflow [1].


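# Row-bind two data frames; with all_cols = TRUE, columns missing from
# either one are first added and filled with NA.
rowbind <- function(x, y, all_cols = FALSE) {
   if(all_cols) {
     x[setdiff(names(y), names(x))] <- NA   # columns only in y
     y[setdiff(names(x), names(y))] <- NA   # columns only in x
   }
   rbind(x, y)   # rbind() on data frames matches columns by name
}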
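
As a quick illustration of what all_cols = TRUE does, here is a minimal sketch on two made-up data frames (a and b are hypothetical and not part of the original question). The function is repeated so that the block runs on its own.

```r
# Self-contained copy of rowbind() from this message.
rowbind <- function(x, y, all_cols = FALSE) {
  if (all_cols) {
    x[setdiff(names(y), names(x))] <- NA
    y[setdiff(names(x), names(y))] <- NA
  }
  rbind(x, y)
}

# Two made-up data frames that share only the column "id".
a <- data.frame(id = 1:2, u = c(10, 20))
b <- data.frame(id = 3:4, v = c("p", "q"))

# With all_cols = TRUE, "v" is added to a and "u" is added to b (as NA)
# before binding, so the result has the columns id, u, v and 4 rows.
ab <- rowbind(a, b, all_cols = TRUE)
print(ab)

# With the default all_cols = FALSE, rbind() stops with an error here,
# because the column names of a and b differ.
failed <- inherits(try(rowbind(a, b), silent = TRUE), "try-error")
```

The rows coming from a have NA in v, and the rows coming from b have NA in u.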

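# \(x, y) is the R >= 4.1 shorthand for function(x, y).
res3 <- Reduce(\(x, y) rowbind(x, y, all_cols = TRUE), df_list)
# res1 is the dplyr::bind_rows result from my earlier message, quoted below.
identical(res1, res3)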
# [1] TRUE
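
If the list is long, a possible variation (a sketch only, not taken from the StackOverflow answer) is to pad every data frame to the union of all column names first and then call rbind once, instead of binding pairwise with Reduce. The helper name rbind_fill and the toy data frames x, y, z are made up for this illustration.

```r
# Hypothetical helper: pad each data frame to the union of all column
# names, put the columns in a common order, then bind everything once.
rbind_fill <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    missing_cols <- setdiff(all_names, names(d))
    if (length(missing_cols) > 0) d[missing_cols] <- NA
    d[all_names]
  })
  do.call(rbind, c(padded, make.row.names = FALSE))
}

# Made-up data: each frame carries a different subset of the columns.
x <- data.frame(day = 1:2, s1 = c(30, 40))
y <- data.frame(day = 3:4, s2 = c(20, 25))
z <- data.frame(day = 5:6, s1 = c(50, 60), s2 = c(0, 5))

res <- rbind_fill(list(x, y, z))
print(res)
```

This should give one row per input row and one column per distinct name, with NA wherever a data frame did not have that column.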


[1] https://stackoverflow.com/a/46635610/8245406

Hope this helps,

Rui Barradas

At 08:37 on 02/06/2022, Rui Barradas wrote:
> Hello,
> 
> Here are two ways, both with dplyr::bind_rows.
> 
> 
> res1 <- Reduce(dplyr::bind_rows, df_list)
> res2 <- do.call(dplyr::bind_rows, df_list)
> identical(res1, res2)
> # [1] TRUE
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 07:12 on 02/06/2022, Stefano Sofia wrote:
>> Dear R-list users,
>>
>> for each winter season from 2000 to 2022 I have a data frame that 
>> collects, for different weather stations, the snowpack height (Hs), 
>> the snowfall in the last 24h (Hn) and a validation flag.
>>
>> Suppose I have the following three data frames:
>>
>>
>> df1 <- data.frame(data_POSIX=seq(as.POSIXct("2000-12-01", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), as.POSIXct("2000-12-05", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), by="1 days"), Station1_Hs = c(30, 
>> 40, 50, NA, 55), Station1_Hn = c(10, 20, 10, NA, 5), Station1_flag = 
>> c(0, 0, 0, NA, 0), Station2_Hs = c(20, 20, 30, 30, 0), Station2_Hn = 
>> c(0, 0, 10, 0, 5), Station2_flag = c(0, 0, 0, 1, 0))
>>
>>
>> df2 <- data.frame(data_POSIX=seq(as.POSIXct("2001-12-01", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), as.POSIXct("2001-12-05", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), by="1 days"), Station1_Hs = c(50, 
>> 60, 70, NA, NA), Station1_Hn = c(20, 20, 20, NA, NA), Station1_flag = 
>> c(0, 0, 0, NA, NA), Station3_Hs = c(20, 20, 30, 30, 0), Station3_Hn = 
>> c(0, 0, 10, 0, 5), Station3_flag = c(0, 0, 0, 1, 0))
>>
>>
>> df3 <- data.frame(data_POSIX=seq(as.POSIXct("2002-12-01", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), as.POSIXct("2002-12-05", 
>> format="%Y-%m-%d", tz="Etc/GMT-1"), by="1 days"), Station2_Hs = c(50, 
>> 60, 70, NA, NA), Station2_Hn = c(20, 20, 20, NA, NA), Station2_flag = 
>> c(0, 0, 0, NA, NA), Station3_Hs = c(20, 20, 30, 30, 0), Station3_Hn = 
>> c(0, 0, 10, 0, 5), Station3_flag = c(0, 0, 0, 1, 0))
>>
>>
>> As you can see, each data frame can have different stations loaded.
>>
>> I need to rbind these data frames, matching columns by name (i.e. 
>> by station name), bearing in mind that the set of stations loaded 
>> in each data frame may differ. The result should be
>>
>> data_POSIX Station1_Hs Station1_Hn Station1_flag Station2_Hs Station2_Hn Station2_flag Station3_Hs Station3_Hn Station3_flag
>> 2000-12-01 30 10 0 20 0 0 NA NA NA
>> 2000-12-02 40 20 0 20 0 0 NA NA NA
>> 2000-12-03 50 10 0 30 10 0 NA NA NA
>> 2000-12-04 NA NA NA 30 0 1 NA NA NA
>> 2000-12-05 55 5 0 0 5 0 NA NA NA
>> 2001-12-01 50 20 0 NA NA NA 20 0 0
>> 2001-12-02 60 20 0 NA NA NA 20 0 0
>> 2001-12-03 70 20 0 NA NA NA 30 10 0
>> 2001-12-04 NA NA NA NA NA NA 30 0 1
>> 2001-12-05 NA NA NA NA NA NA 0 5 0
>> 2002-12-01 NA NA NA 50 20 0 20 0 0
>> 2002-12-02 NA NA NA 60 20 0 20 0 0
>> 2002-12-03 NA NA NA 70 20 0 30 10 0
>> 2002-12-04 NA NA NA NA NA NA 30 0 1
>> 2002-12-05 NA NA NA NA NA NA 0 5 0
>>
>> I tried this code:
>>
>> df_list <- list(df1, df2, df3)
>> allNms <- unique(unlist(lapply(df_list, names)))
>> do.call(rbind, c(lapply(df_list, function(x) data.frame(c(x, 
>> sapply(setdiff(allNms, names(x)), function(y) NA)))), 
>> make.row.names=FALSE))
>>
>> but I get this error:
>> Error in (function (..., row.names = NULL, check.rows = FALSE, 
>> check.names = TRUE,  :
>>    arguments imply differing number of rows
>>
>> Could someone please help me?
>>
>>
>> Thank you for your attention
>>
>> Stefano
>>
>>
>>           (oo)
>> --oOO--( )--OOo--------------------------------------
>> Stefano Sofia PhD
>> Civil Protection - Marche Region - Italy
>> Meteo Section
>> Snow Section
>> Via del Colle Ameno 5
>> 60126 Torrette di Ancona, Ancona (AN)
>> Uff: +39 071 806 7743
>> E-mail: stefano.sofia using regione.marche.it
>> ---Oo---------oO----------------------------------------
>>
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 


