[R] How to delete a duplicate observation

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Sep 13 20:17:23 CEST 2007


nuyaying wrote:
> I have a data set with 3 variables V1, V2, V3.  If 2 data points
> have the same values on both V1 and V2, I want to delete the one that
> has the smaller V3 value, i.e., in the data below, I want to delete
> the first observation.  How can I do that?  Thanks in advance!
>
> V1  V2  V3
> 3    3     1
> 3    3     4
>
>   
Tricky one... I think something like this should work:

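# Split V3 by each (V1, V2) combination; every element of l is a numeric vector
l <- split(d$V3, list(d$V1,d$V2))
ixl <- lapply(l, function(x) {
   # x is a vector, so use length(); nrow() would return NULL here
   if ((n <- length(x)) == 2)
      seq_len(n) != which.min(x)   # drop the smaller V3 of the pair
   else
      rep(TRUE, n)                 # keep all rows of other groups
})
# Map the per-group flags back to the original row order
ix <- unsplit(ixl, list(d$V1,d$V2))
d[ix,]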
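
For comparison, here is a shorter sketch built on order() and duplicated(). It is not part of the reply above. The sample data frame is hypothetical: the two rows from the question plus two assumed rows with unique (V1, V2) pairs. Note one difference in behaviour: it keeps only the row with the largest V3 in every (V1, V2) group, including groups with more than two rows, whereas the code above only touches groups of exactly two.

```r
# Hypothetical sample data: the two rows from the question plus two
# assumed rows whose (V1, V2) pairs are unique.
d <- data.frame(V1 = c(3, 3, 1, 2),
                V2 = c(3, 3, 5, 3),
                V3 = c(1, 4, 2, 7))

# Row order that puts the largest V3 first within each (V1, V2) pair.
o <- order(d$V1, d$V2, -d$V3)

# duplicated() flags every row after the first within a (V1, V2) pair,
# so the unflagged rows are the ones with the largest V3 per pair.
keep <- !duplicated(d[o, c("V1", "V2")])

# o[keep] holds the original row indices of the kept rows;
# sort() restores the original row order.
result <- d[sort(o[keep]), ]
result
```

With the sample data this drops the first row (V3 = 1) and keeps the other three in their original order.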

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


