[Rd] `merge()` not consistent in how it treats list columns

Avi Gross
Sat Jan 2 21:56:35 CET 2021


Antoine,

Have you considered converting the non-list to a list explicitly so this
does not matter?

For a long time, few people used lists in this context, albeit in the
tidyverse it is now better supported and probably more common.

This is an area many have found annoying when you have implicit conversions.
What if one ID field was character and the other was numeric? In some
languages the conversion always goes to character (as in R) but in some it
might go numeric in one direction and in some it may refuse and demand you
convert it yourself. 
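
To make the character-versus-numeric point concrete, here is a small sketch of how base R compares mixed types. The variable names `key_num` and `key_chr` are made up for illustration; the only functions used are `match()` and `c()`.

```r
# match() coerces both arguments to a common type before comparing,
# so a numeric key can match a character key that spells the same value.
key_num <- c(1, 2)
key_chr <- c("2", "1")

# Positions of each numeric key within the character keys.
pos <- match(key_num, key_chr)
print(pos)

# Combining numeric and character values yields a character vector.
mixed <- c(1, "a")
print(class(mixed))
```

This only illustrates the implicit coercion rule for atomic vectors; it says nothing about how other languages or other join functions behave.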

Do you suggest that a unique solution exists for complex cases, so that the
software should know you want to convert a vector to a list? What if one side
is a list containing a list containing a list, many levels deep, and the
other has no levels, or fewer, or more? Is it obvious to take the deepest
case and change all others to match? Do you lose things in the process?

When things may not work, you can certainly suggest a change, but you can
also treat it as a case where YOU should make sure the types are compatible
before a merge.
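
One way to make the types compatible up front, sketched on the example data from the original message quoted below. Note that this flattens the list column to a character vector rather than going the other way, and it assumes every list element is a single string, so nothing is lost by flattening.

```r
df1 <- data.frame(a = 1)
df2 <- data.frame(b = 2)
df1$id <- "ID"
df2$id <- list("ID")

# Assumption: each element of the list column holds exactly one value.
# Stop early if that is not true, rather than silently dropping data.
stopifnot(all(lengths(df2$id) == 1L))

# Flatten the list column to a plain character vector.
df2$id <- vapply(df2$id, as.character, character(1))

# With both key columns atomic, the argument order no longer matters.
res_xy <- merge(df1, df2)
res_yx <- merge(df2, df1)
print(res_xy)
print(res_yx)
```

If a list column really holds nested or multi-element values, this check fails on purpose and you have to decide yourself how the key should be built.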



-----Original Message-----
From: R-devel <r-devel-bounces using r-project.org> On Behalf Of Antoine Fabri
Sent: Saturday, January 2, 2021 2:16 PM
To: R-devel <r-devel using r-project.org>
Subject: [Rd] `merge()` not consistent in how it treats list columns

Dear R-devel,

When trying to merge 2 data frames by an "id" column, where this column is a
character vector in one of them and a list of character in the other,
merge() behaves differently depending on which is given first.

Example:

```
df1 <- data.frame(a=1)
df2 <- data.frame(b=2)
df1$id <- "ID"
df2$id <- list("ID")

# these print in a similar way, so the upcoming error will be hard to
# diagnose
df1
#>   a id
#> 1 1 ID
df2
#>   b id
#> 1 2 ID

# especially as this works well, df2$id is treated as an atomic vector
merge(df1, df2)
#>   id a b
#> 1 ID 1 2

# But this fails with a cryptic error message
merge(df2, df1)
#> Error in sort.list(bx[m$xi]): 'x' must be atomic for 'sort.list', method "shell" and "quick"
#> Have you called 'sort' on a list?
```
```
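
Since the two data frames print identically, a quick check of the column types can help with diagnosis. This is a minimal sketch using `vapply()` and `is.list()` from base R; the names `list_cols_1` and `list_cols_2` are only for illustration.

```r
df1 <- data.frame(a = 1)
df2 <- data.frame(b = 2)
df1$id <- "ID"
df2$id <- list("ID")

# For each column, report whether it is stored as a list.
list_cols_1 <- vapply(df1, is.list, logical(1))
list_cols_2 <- vapply(df2, is.list, logical(1))
print(list_cols_1)
print(list_cols_2)

# str() also shows the difference that print() hides.
str(df2)
```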

I believe that if we let it work one way it should work the other, and that
if it works neither way, an explicit error mentioning that we can't join by
a list column would be helpful.
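
To illustrate the kind of explicit error meant here, a rough sketch of a user-level wrapper follows. `merge_checked` is a hypothetical helper, not part of base R and not a proposed patch; it assumes `by` is given as column names common to both data frames and ignores `by.x`/`by.y`.

```r
# Hypothetical wrapper: refuse to merge when a key column is a list,
# regardless of which data frame it is in.
merge_checked <- function(x, y, by = intersect(names(x), names(y)), ...) {
  bad_x <- by[vapply(x[by], is.list, logical(1))]
  bad_y <- by[vapply(y[by], is.list, logical(1))]
  bad <- unique(c(bad_x, bad_y))
  if (length(bad) > 0L) {
    stop("can't merge by list column(s): ", paste(bad, collapse = ", "),
         call. = FALSE)
  }
  merge(x, y, by = by, ...)
}

df1 <- data.frame(a = 1)
df2 <- data.frame(b = 2)
df1$id <- "ID"
df2$id <- list("ID")

# Both argument orders now fail the same way, with the same message.
msg_xy <- tryCatch(merge_checked(df1, df2), error = conditionMessage)
msg_yx <- tryCatch(merge_checked(df2, df1), error = conditionMessage)
print(msg_xy)
print(msg_yx)
```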

Many thanks and happy new year to all the R community,

Antoine
