[R] merge a list of data frames

David Winsemius dwinsemius at comcast.net
Thu Sep 6 06:02:16 CEST 2012


On Sep 5, 2012, at 8:51 PM, Sam Steingold wrote:

> I have a list of data frames:
> 
>> str(data)
> List of 4
> $ :'data.frame':	700773 obs. of  3 variables:
>  ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>  ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>  ..$ V3: num [1:700773] 1 1 1 1 1 ...
> $ :'data.frame':	700773 obs. of  3 variables:
>  ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>  ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>  ..$ V3: num [1:700773] 1 1 1 1 1 ...
> $ :'data.frame':	700773 obs. of  3 variables:
>  ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>  ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>  ..$ V3: num [1:700773] 1 1 1 1 1 ...
> $ :'data.frame':	700773 obs. of  3 variables:
>  ..$ V1: chr [1:700773] "200160325893778" "200130647544079" "200130446465779" "200120186959078" ...
>  ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>  ..$ V3: num [1:700773] 1 1 1 1 1 1 1 1 1 1 ...
> 
> I want to merge them.

Why? What are you expecting?

> I tried to follow
> http://rwiki.sciviews.org/doku.php?id=tips%3adata-frames%3amerge
> and did:
> 
>> data.1 <- Reduce(function(f1,f2) merge(f1,f2,by=c("V1"),all=TRUE), data)
> Warning message:
> In merge.data.frame(f1, f2, by = c("V1"), all = TRUE) :
>  column names 'V2.x', 'V3.x', 'V2.y', 'V3.y' are duplicated in the result
>> str(data.1)
> 'data.frame':	700773 obs. of  9 variables:
> $ V1  : chr  "100010000099079" "100010000254078" "100010000499078" "100010000541779" ...
> $ V2.x: int  0 0 0 0 0 0 0 0 0 0 ...
> $ V3.x: num  0.476 0.748 0.442 0.483 0.577 ...
> $ V2.y: int  0 0 0 0 0 0 0 0 0 0 ...
> $ V3.y: num  0.476 0.748 0.442 0.483 0.577 ...
> $ V2.x: int  0 0 0 0 0 0 0 0 0 0 ...
> $ V3.x: num  0.476 0.752 0.443 0.485 0.578 ...
> $ V2.y: int  0 0 0 0 0 0 0 0 0 0 ...
> $ V3.y: num  0.47 0.733 0.57 0.416 0.616 ...
> 
> I don't like the warning and I don't like that I now have to use [n] to
> access identically named columns,

Perhaps it would make more sense to explain what your goal is, rather than showing us two divergent efforts, neither of which is satisfactory?
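
If the aim is a single wide data frame keyed on V1, with one pair of value columns per list element, then one possible approach is to rename the non-key columns before merging, so that merge() never has to fall back on its default ".x"/".y" suffixes. The sketch below is only an illustration under that assumption: the toy list `dfs`, the helper list `tagged`, and the ".1", ".2", ... suffix scheme are invented here and are not from the original post.

```r
# Toy stand-in for the list in the original post (hypothetical data).
dfs <- list(
  data.frame(V1 = c("a", "b", "c"), V2 = 0L, V3 = c(0.1, 0.2, 0.3),
             stringsAsFactors = FALSE),
  data.frame(V1 = c("a", "b", "c"), V2 = 0L, V3 = c(0.4, 0.5, 0.6),
             stringsAsFactors = FALSE),
  data.frame(V1 = c("b", "c", "d"), V2 = 1L, V3 = c(0.7, 0.8, 0.9),
             stringsAsFactors = FALSE)
)

# Tag every non-key column with the index of the list element it came from,
# so no two inputs share a value-column name.
tagged <- lapply(seq_along(dfs), function(i) {
  d <- dfs[[i]]
  value_cols <- names(d) != "V1"
  names(d)[value_cols] <- paste(names(d)[value_cols], i, sep = ".")
  d
})

# Same Reduce/merge idiom as in the post; only the key column is shared now.
merged <- Reduce(function(f1, f2) merge(f1, f2, by = "V1", all = TRUE), tagged)
str(merged)
```

With unique names going in, the full outer join should complete without the duplicated-column warning, and each column can be addressed by name (for example merged$V3.2) rather than by position.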

-- 
David.


> but, I guess, this is better than
> this:
> 
> library('reshape')
> 
>> data.1 <- merge_all(data,by="V1",all=TRUE)
> Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE,  : 
>  formal argument "all" matched by multiple actual arguments
>> data.1 <- merge_all(data,by="V1",sort=TRUE,all=TRUE)
> Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE,  : 
>  formal argument "all" matched by multiple actual arguments
>> data.1 <- merge_all(data,by="V1",sort=TRUE)
> Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE,  : 
>  formal argument "sort" matched by multiple actual arguments
>> data.1 <- merge_all(data,by="V1")
> Error in `[.data.frame`(df, , match(names(dfs[[1]]), names(df))) : 
>  undefined columns selected
>> data.1 <- merge_all(data,by=c("V1"))
> Error in `[.data.frame`(df, , match(names(dfs[[1]]), names(df))) : 
>  undefined columns selected
> 
> what does 'formal argument "sort" matched by multiple actual arguments' mean?
> 
> thanks.
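
On the last question: R raises 'formal argument "..." matched by multiple actual arguments' when the same named argument reaches a function twice. Judging only from the call shown in the error message, merge_all appears to supply all = TRUE and sort = FALSE itself while also passing the caller's extra arguments along, so a user-supplied all= or sort= would arrive a second time. The sketch below reproduces the same message with a made-up wrapper; `merge_pair` is hypothetical and is not the reshape source.

```r
# Hypothetical wrapper: it hard-codes all = TRUE and also forwards '...'
# to merge(), which is the pattern the error message suggests.
merge_pair <- function(x, y, ...) merge(x, y, all = TRUE, ...)

a <- data.frame(k = 1:2, u = c(10, 20))
b <- data.frame(k = 2:3, w = c(30, 40))

# Fine: 'all' is supplied exactly once (by the wrapper).
ok <- merge_pair(a, b, by = "k")

# Fails: 'all' now reaches merge.data.frame() twice.
res <- tryCatch(
  merge_pair(a, b, by = "k", all = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```

Under that reading, the way to avoid the error is simply not to pass arguments that the wrapper already sets.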
> 
> -- 
> Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
> http://www.childpsy.net/ http://ffii.org http://pmw.org.il
> http://dhimmi.com http://palestinefacts.org http://iris.org.il
> I just forgot my whole philosophy of life!!!
> 

David Winsemius, MD
Alameda, CA, USA



