[R] How to delete a duplicate observation

Thu Sep 13 20:58:37 CEST 2007

How about (assuming the data is in the data frame my.df):

> my.df2 <- my.df[order(my.df$V3, decreasing=TRUE),]
> my.df3 <- my.df2[ !duplicated( my.df2[,c('V1','V2')] ), ]
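
As a quick check, here is a small sketch of the same two steps on toy data.  Only the first two rows come from the question; the other rows are made up for illustration:

```r
# Toy data: the two rows from the question plus hypothetical extra rows
my.df <- data.frame(V1 = c(3, 3, 1, 2, 1),
                    V2 = c(3, 3, 5, 2, 5),
                    V3 = c(1, 4, 7, 0, 9))

# Sort so that the largest V3 comes first
my.df2 <- my.df[order(my.df$V3, decreasing = TRUE), ]

# duplicated() flags every row after the first occurrence of a (V1, V2)
# pair, so negating it keeps the row with the largest V3 for each pair
my.df3 <- my.df2[!duplicated(my.df2[, c("V1", "V2")]), ]
print(my.df3)
```

The result has one row per (V1, V2) pair, in decreasing order of V3 rather than in the original row order.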

If the order of the rows matters, then we will need to add a couple of
steps to restore it.  You did not say what to do if 3 or more points
match; this approach keeps only the row with the largest V3 value among
all rows matching on V1 and V2.
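
One possible way to do the reordering, as a rough sketch: record the original row position in a helper column before sorting, then sort back on it at the end.  The column name orig.row and the toy data are my own assumptions, not something from your data:

```r
# Toy data with three rows matching on (V1, V2) = (1, 5)
my.df <- data.frame(V1 = c(3, 3, 1, 1, 1),
                    V2 = c(3, 3, 5, 5, 5),
                    V3 = c(1, 4, 7, 9, 2))

# Remember where each row started (orig.row is a hypothetical helper name)
my.df$orig.row <- seq_len(nrow(my.df))

# Same two steps as before: largest V3 first, then drop later duplicates
my.df2 <- my.df[order(my.df$V3, decreasing = TRUE), ]
my.df3 <- my.df2[!duplicated(my.df2[, c("V1", "V2")]), ]

# Restore the original row order and drop the helper column
my.df3 <- my.df3[order(my.df3$orig.row), ]
my.df3$orig.row <- NULL
print(my.df3)
```

Here the three rows matching on (1, 5) collapse to the single row with V3 equal to 9, and the surviving rows come out in their original relative order.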

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of nuyaying
> Sent: Thursday, September 13, 2007 10:51 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] How to delete a duplicate observation
> 
> 
> 
> I have a data set with 3 variables V1, V2, V3.  If 2 data points
> have the same values on both V1 and V2, I want to delete the one
> that has the smaller V3 value, i.e., in the data below, I want to
> delete the first observation.  How can I do that?  Thanks in
> advance!
> 
> V1  V2  V3
> 3    3     1
> 3    3     4