[R] delete duplicated from data.frame

Sundar Dorai-Raj sundar.dorai-raj at PDF.COM
Tue May 18 16:38:11 CEST 2004



Christian Schulz wrote:

> Hi,
> 
> ?unique
> unique returns a vector, data frame or array like x * but with duplicate 
> elements removed *
> 
> What am I doing wrong when deleting duplicated rows with the same MEMBERNO?
> 
> februar <-  dmsegment[unique(dmsegment$MEMBERNO),]
> 
> This reduces the data from 197,188 rows to 184,199, but not all duplicated 
> MEMBERNO values are removed, as a primary key setting in MySQL tells me and 
> as I can see with fix(februar).
> 
> Curious why MEMBERNO 4, 5, 6 and 11 are left out!
> dmsegment$MEMBERNO[1:10]
> [1] 1  4  5  6  7  9  10  11  16  21
> 
> februar$MEMBERNO[1:10]
> [1] 1  6  7  9  10  16  21  26  53  72 
> 
> Using unique with a single vector, it works as I expect.
> 
> 
> P.S.
> I tried -duplicated but had no better success.
> 
> Many Thanks,
> Christian

Hi,
   Did you try unique(dmsegment$MEMBERNO) on its own to see what you get? It 
looks like you are expecting an index. But, as you pointed out, unique returns 
a vector, data frame or array like x *but with duplicate elements 
removed*, which means unique returns the non-duplicated values of 
dmsegment$MEMBERNO. Inside [ , ] those values are then treated as row 
numbers, so row 4 is selected rather than the row whose MEMBERNO is 4. To 
accomplish what I think you are trying to do, try:

februar <-  dmsegment[!duplicated(dmsegment$MEMBERNO), ]

Of course, this is a guess and may not be what you really want.
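
To make the difference concrete, here is a small sketch on made-up data. The 
toy dmsegment and its VALUE column are assumptions for illustration only, not 
the original data. It compares indexing by unique(), by !duplicated(), and by 
the -duplicated() variant mentioned in the P.S.

```r
# Toy stand-in for dmsegment (hypothetical data, not the poster's).
dmsegment <- data.frame(
  MEMBERNO = c(1, 4, 4, 5, 6, 6, 7),
  VALUE    = c("a", "b", "c", "d", "e", "f", "g")
)

# unique() returns the distinct values 1 4 5 6 7, not row positions.
unique(dmsegment$MEMBERNO)

# Used as a row index, those values pick rows 1, 4, 5, 6 and 7,
# so MEMBERNO 4 is lost and MEMBERNO 6 still appears twice.
wrong <- dmsegment[unique(dmsegment$MEMBERNO), ]

# duplicated() returns one logical per row; negating it with ! keeps
# the first row seen for each MEMBERNO.
februar <- dmsegment[!duplicated(dmsegment$MEMBERNO), ]

# Arithmetic minus coerces the logicals to 0 / -1. As an index, the
# zeros are ignored and -1 drops row 1, so duplicates are not removed.
minus <- dmsegment[-duplicated(dmsegment$MEMBERNO), ]
```

On this toy data only the !duplicated() version ends up with one row per 
MEMBERNO. It keeps the first occurrence of each value; if a different row 
should survive, the data would need to be sorted accordingly first.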

--sundar



