[Rd] Duplicate column names created by base::merge() when by.x has the same name as a column in y

Fri Feb 16 17:53:44 CET 2018

Hi Scott,

It seems like reasonable behavior to me. What result would you expect?
That the second "name" should be called "name.y"?

The "merge" documentation says:

    If the columns in the data frames not used in merging have any
    common names, these have ‘suffixes’ (‘".x"’ and ‘".y"’ by default)
    appended to try to make the names of the result unique.

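Since the first "name" column was used in merging, the suffix rule
does not apply to it. The second "name" column is then the only
unmerged column with that name, so leaving both without a suffix
seems consistent with the documentation...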
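
If the goal is just to avoid the duplicated name, one possible workaround
is to rename the clashing column in the second data frame before merging.
This is only a minimal sketch using the data from your example; the name
"child_name" is a hypothetical choice of mine, and I set
stringsAsFactors = FALSE explicitly so the columns stay character vectors.

```r
# Same example data as in the original post.
parents <- data.frame(name = c("Sarah", "Max", "Qin", "Lex"),
                      sex  = c("F", "M", "F", "M"),
                      age  = c(41, 43, 36, 51),
                      stringsAsFactors = FALSE)
children <- data.frame(parent = c("Sarah", "Max", "Qin"),
                       name   = c("Oliver", "Sebastian", "Kai-lee"),
                       sex    = c("M", "M", "F"),
                       age    = c(5, 8, 7),
                       stringsAsFactors = FALSE)

# Rename the clashing non-key column in y before merging.
# "child_name" is a hypothetical name chosen for this sketch.
names(children)[names(children) == "name"] <- "child_name"

merged <- merge(parents, children, by.x = "name", by.y = "parent")

# No column name should appear twice in the result.
stopifnot(anyDuplicated(names(merged)) == 0L)
names(merged)
```

With the rename in place, "sex" and "age" are the only names shared by the
unmerged columns, so they are the only ones that get the ".x" and ".y"
suffixes.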

Frederick

On Fri, Feb 16, 2018 at 09:08:29AM +1100, Scott Ritchie wrote:
> Hi,
> 
> I was unable to find a bug report for this with a cursory search, but would
> like clarification if this is intended or unavoidable behaviour:
> 
> ```{r}
> # Create example data.frames
> parents <- data.frame(name=c("Sarah", "Max", "Qin", "Lex"),
>                       sex=c("F", "M", "F", "M"),
>                       age=c(41, 43, 36, 51))
> children <- data.frame(parent=c("Sarah", "Max", "Qin"),
>                        name=c("Oliver", "Sebastian", "Kai-lee"),
>                        sex=c("M", "M", "F"),
>                        age=c(5,8,7))
> 
> # merge() creates a duplicated "name" column:
> merge(parents, children, by.x = "name", by.y = "parent")
> ```
> 
> Output:
> ```
>    name sex.x age.x      name sex.y age.y
> 1   Max     M    43 Sebastian     M     8
> 2   Qin     F    36   Kai-lee     F     7
> 3 Sarah     F    41    Oliver     M     5
> Warning message:
> In merge.data.frame(parents, children, by.x = "name", by.y = "parent") :
>   column name ‘name’ is duplicated in the result
> ```
> 
> Kind Regards,
> 
> Scott Ritchie