[R] Find "undirected" duplicates in a tibble

Fri Aug 20 17:43:46 CEST 2021

Hello,

This seems elegant to me, but it's also the slowest, courtesy of sort().

apply(x, 1, sort) |> t() |> unique()

(In my tests, Greg's base apply is fastest for small inputs; once
nrow(x) > 700, Eric's dplyr version is fastest.)
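
For anyone who wants to check this on their own data, here is a minimal sketch that runs the approaches side by side on Greg's example data (quoted below) and confirms they agree. The names by_sort, by_minmax and by_pminmax are mine, and the third variant is a base-R stand-in for the pmin/pmax idea in Eric's dplyr code, not his actual pipeline. The timing line is only illustrative; results will depend on the machine and on nrow(x).

```r
# Example data taken from Greg's message
x <- data.frame(Source = rep(1:3, 4),
                Target = c(rep(1, 3), rep(2, 3), rep(3, 3), rep(4, 3)))

# Sort each row, then keep the unique rows (the native pipe needs R >= 4.1)
by_sort <- apply(x, 1, sort) |> t() |> unique()

# Greg's row-wise min/max version
by_minmax <- unique(t(apply(x, 1, function(r) c(A = min(r), B = max(r)))))

# Vectorised pmin/pmax in base R (assumed analogue of the dplyr version)
by_pminmax <- unique(data.frame(Source = pmin(x$Source, x$Target),
                                Target = pmax(x$Source, x$Target)))

# All three should give the same number of undirected pairs
stopifnot(nrow(by_sort) == nrow(by_minmax),
          nrow(by_sort) == nrow(by_pminmax))

# Rough timing of the sort-based version; swap in the others to compare
system.time(for (i in 1:100) unique(t(apply(x, 1, sort))))
```

The pmin/pmax variant avoids a function call per row, which is presumably why that style pulls ahead as the number of rows grows.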

Hope this helps,

Rui Barradas

At 15:13 on 20/08/21, Greg Minshall wrote:
> Eric,
> 
>> x %>% transmute( a=pmin(Source,Target), b=pmax(Source,Target)) %>%
>>    unique() %>% rename(Source=a, Target=b)
> 
> ah, very nice.  i have trouble remembering, e.g., unique().
> 
> fwiw, (hopefully) here's a baser version.
> ----
>    x = data.frame(Source=rep(1:3,4), Target=c(rep(1,3),rep(2,3),rep(3,3),rep(4,3)))
> 
>    y <- apply(x, 1, function(y) return (c(A=min(y), B=max(y))))
>    unique(t(y))
> ----
> 
> cheers, Greg
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>