[R] Removing rows that are duplicates but column values are in reversed order

arun smartpink111 at yahoo.com
Fri Apr 12 22:06:59 CEST 2013


Hi,
From your example data, 

dat1<- read.table(text="
id1   id2   value
a      b       10
c      d        11
b     a         10
c      e         12 
",sep="",header=TRUE,stringsAsFactors=FALSE)
# With this data, dropping the rows that repeat a value gives the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12

But if you have cases like the one below (assuming that every row with the ids in reversed order has the same value as its counterpart):
dat2<- read.table(text="
id1   id2   value
a      b       10
c      d        11
b     a         10
e      c         12 
c      e         12 
",sep="",header=TRUE,stringsAsFactors=FALSE)
# keep only the rows whose ids are already in alphabetical order (id1 before id2)
dat2[apply(dat2[,-3],1,function(x) {x1<- order(x); x1[1]<x1[2]}),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#5   c   e    12
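
Note that the order() filter above keeps only the rows whose ids are already in alphabetical order, so a reversed row that has no forward counterpart would be dropped too. Below is a minimal sketch of an alternative, assuming both id columns are character vectors (not factors): build an order-independent key with pmin()/pmax() and keep the first occurrence of each key. The names dat2b, key and res are only for illustration, and the extra row g/f is made up to show the unpaired case.

```r
dat2b <- data.frame(
  id1   = c("a", "c", "b", "e", "c", "g"),
  id2   = c("b", "d", "a", "c", "e", "f"),
  value = c(10, 11, 10, 12, 12, 13),
  stringsAsFactors = FALSE
)

# Order-independent key: the smaller id first, the larger id second
key <- paste(pmin(dat2b$id1, dat2b$id2), pmax(dat2b$id1, dat2b$id2), sep = "\t")

# Keep the first row seen for each unordered pair
res <- dat2b[!duplicated(key), ]
res
# rows 1, 2, 4 and 6 are kept
```

This keeps row 4 (e c) rather than row 5 (c e), because it retains whichever orientation appears first, and it keeps the unpaired row g/f.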


#or you have cases like these, where the same pair occurs more than twice:

dat3<- read.table(text="
id1   id2   value
a      b       10
c      d        11
b     a         10
a      b        10
e      c         12 
c      e         12
c      d         11 
",sep="",header=TRUE,stringsAsFactors=FALSE)

# first keep the rows whose ids are in alphabetical order, then drop repeated values
dat3New<-dat3[apply(dat3[,-3],1,function(x) {x1<- order(x); x1[1]<x1[2]}),]
dat3New[!duplicated(dat3New$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#6   c   e    12
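
The two steps above rely on each value belonging to only one pair of ids. If that might not hold, here is a hedged sketch that treats a row as a duplicate only when both the unordered pair and the value match. It again assumes character id columns; dat3b, key3 and res3 are illustrative names, and the last row (f g 10) is made up to show a different pair sharing a value.

```r
dat3b <- data.frame(
  id1   = c("a", "c", "b", "a", "e", "c", "c", "f"),
  id2   = c("b", "d", "a", "b", "c", "e", "d", "g"),
  value = c(10, 11, 10, 10, 12, 12, 11, 10),
  stringsAsFactors = FALSE
)

# One key row per data row: ids sorted within the row, plus the value
key3 <- data.frame(
  lo    = pmin(dat3b$id1, dat3b$id2),
  hi    = pmax(dat3b$id1, dat3b$id2),
  value = dat3b$value
)

# duplicated() on a data frame flags rows that repeat an earlier row
res3 <- dat3b[!duplicated(key3), ]
res3
# rows 1, 2, 5 and 8 are kept
```

Here row 8 (f g 10) survives even though it shares the value 10 with row 1, and row 5 (e c) is kept instead of row 6 because it comes first.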
A.K.




>Hi everybody, 
>
>I was hoping that someone could help me with this problem. I
>have a table with 3 columns. Some rows contain duplicates where the
>identifiers in columns 1 and 2 are in reverse order, but the value
>associated with the row is the same.
>
>For example: 
>
>id1   id2   value 
>a      b       10 
>c      d        11 
>b     a         10 
>c      e         12 
>
>Rows 1 and 3 are duplicates (have the same value). I would like
>to retain only row 1 and delete row 3. Final table should look like
>this:
>
>id1   id2   value 
>a      b       10 
>c      d        11 
>c      e         12 
>
>Thanks in advance for any help provided. 
>
>Vince


