[R] Creating one df from 85 df present in a list

Rasmus Liland
Sat Jun 13 01:54:17 CEST 2020


On 2020-06-10 13:14 -0700, Bert Gunter wrote:
> On Wed, Jun 10, 2020 at 11:48 AM Alejandro Ureta wrote:
> > 
> > Hi, I am trying to fuse (cbind, merge ...
> > NOT rbind) several data frames with
> > different numbers of rows. All of them
> > are in a list, and I am using the code
> > extract shown below. The function merge()
> > works well with two data frames but not
> > with more than two. I have 85 data frames
> > to join in this way (85 in the list).
> > Could you please let me know how to get
> > all 85 merged? Thanks
> >
> > fusion_de_tablas = merge(
> >   red_tablas_por_punto[["1 - Bv.Artigas y la Rambla (Terminal CUTCSA)"]],
> >   red_tablas_por_punto[["10 - Avenida Millán 2515 (Hospital Vilardebó)"]],
> >   red_tablas_por_punto[["100 - Fauquet 6358 (Hospital Saint Bois)"]],
> >   by= 'toma_de_muestras', all = T )
> 
> ?do.call  -- takes a list of arguments to a function
> ... as in
> do.call(merge, yourlist)  ## or similar perhaps

Dear Alejandro,

it would be easier to help you if you 
provided an example of what fusion_de_tablas 
looks like.  

Here is a small example of combining data 
frames of different sizes that share some 
columns and have others named differently. 

	red_tablas_por_punto <-
	  list(
	    "1 - Bv.Artigas y la Rambla (Terminal CUTCSA)" =
	      data.frame("a"=1:3,
	                 "b"=4:6,
	                 "c"=4:6,
	                 'toma_de_muestras'=1),
	    "10 - Avenida Millán 2515 (Hospital Vilardebó)" =
	      data.frame("d"=4:8,
	                 "b"=8:12,
	                 'toma_de_muestras'=7),
	    "100 - Fauquet 6358 (Hospital Saint Bois)" =
	      data.frame("e"=100:101,
	                 "a"=85:86,
	                 'toma_de_muestras'=4)
	  )
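	# cn is the union of all column names, in order of first appearance
	unified.df <- lapply(names(red_tablas_por_punto),
	  function(tabla, cn) {
	    x <- red_tablas_por_punto[[tabla]]
	    # add the columns this table lacks, filled with NA
	    x[,cn[!(cn %in% colnames(x))]] <- NA
	    # put the columns in the same order in every table
	    x <- x[,cn]
	    # remember which table each row came from
	    x$tabla <- tabla
	    return(x)
	  }, cn=unique(unlist(lapply(red_tablas_por_punto, colnames))))
	# stack the now identically shaped tables
	unified.df <- do.call(rbind, unified.df)
	unified.df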

which yields

	    a  b  c toma_de_muestras  d   e                                         tabla
	1   1  4  4                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
	2   2  5  5                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
	3   3  6  6                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
	4  NA  8 NA                7  4  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
	5  NA  9 NA                7  5  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
	6  NA 10 NA                7  6  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
	7  NA 11 NA                7  7  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
	8  NA 12 NA                7  8  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
	9  85 NA NA                4 NA 100      100 - Fauquet 6358 (Hospital Saint Bois)
	10 86 NA NA                4 NA 101      100 - Fauquet 6358 (Hospital Saint Bois)
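
The same pad-and-stack steps can be sketched as a function that takes the list as an argument instead of reading it from the global environment. The name stack_tables and the two small tables are made up for illustration; this is a sketch under those assumptions, not a tested recipe for the real 85 tables.

```r
# Hypothetical helper (not from the thread): pad every table to the
# union of all column names, tag it with its list name, then rbind.
stack_tables <- function(tables) {
  cn <- unique(unlist(lapply(tables, colnames)))
  padded <- lapply(names(tables), function(nm) {
    x <- tables[[nm]]
    for (missing_col in setdiff(cn, colnames(x))) {
      x[[missing_col]] <- NA    # add each absent column as NA
    }
    x <- x[, cn, drop = FALSE]  # same column order in every table
    x$tabla <- nm               # remember the source table
    x
  })
  do.call(rbind, padded)
}

# Small made-up tables, only for illustration
tables <- list(
  p1 = data.frame(a = 1:3, b = 4:6, toma_de_muestras = 1),
  p2 = data.frame(d = 4:8, b = 8:12, toma_de_muestras = 7)
)
stacked <- stack_tables(tables)
dim(stacked)
```

Passing the list in as an argument means the function can be reused on any named list of data frames, which should include the full list of 85.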

I also found [1] that you could use merge, 
as you tried, by combining it with Reduce: 

	Reduce(function(x, y)
	  merge(x, y, by='toma_de_muestras', all=TRUE),
	  red_tablas_por_punto)

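which yields (the toma_de_muestras values 
below, 10001 to 10010, differ from the 1, 7 
and 4 in the example above; this output 
appears to come from a run with a unique 
value in each row)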

	   toma_de_muestras a.x b.x  c  d b.y   e a.y
	1             10001   1   4  4 NA  NA  NA  NA
	2             10002   2   5  5 NA  NA  NA  NA
	3             10003   3   6  6 NA  NA  NA  NA
	4             10004  NA  NA NA  4   8  NA  NA
	5             10005  NA  NA NA  5   9  NA  NA
	6             10006  NA  NA NA  6  10  NA  NA
	7             10007  NA  NA NA  7  11  NA  NA
	8             10008  NA  NA NA  8  12  NA  NA
	9             10009  NA  NA NA NA  NA 100  85
	10            10010  NA  NA NA NA  NA 101  86

where the partly shared “a” and “b” columns 
are not unified, but are split into a.x/a.y 
and b.x/b.y ...  thus, I like my initial 
step-by-step apply-based solution better ... 
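
If the split columns are the main objection, one possible variation is to leave out the by argument, so that merge() joins on every column name the two data frames share. This is only a sketch on made-up tables that mirror the example above, not something run against the real data.

```r
# Made-up tables mirroring the example above
tables <- list(
  data.frame(a = 1:3, b = 4:6, c = 4:6, toma_de_muestras = 1),
  data.frame(d = 4:8, b = 8:12, toma_de_muestras = 7),
  data.frame(e = 100:101, a = 85:86, toma_de_muestras = 4)
)
# Without `by`, merge() joins on intersect(names(x), names(y)),
# so shared columns are used as keys instead of being suffixed.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), tables)
colnames(merged)
```

The catch is that rows only match when they agree on all shared columns, not just on toma_de_muestras, so this behaves differently from the by='toma_de_muestras' version whenever a shared column holds different values for the same sample.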

Best,
Rasmus

[1] https://stackoverflow.com/questions/22644780/merging-multiple-csv-files-in-r-using-do-call
