[R] Unique.data.frame...still getting duplicates

F Z gerifalte28 at hotmail.com
Fri Jun 25 16:50:48 CEST 2004


Thanks to Alec Stephenson, Andy Liaw and Prof. Brian Ripley.  I tried Alec's 
suggestion:

>dupl <- data[!duplicated(data$ID),]
>d <- duplicated(dupl$ID)
>summary(as.factor(d))
FALSE
21547 #it worked!

Thanks again!
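
For anyone who finds this thread later, the same check can be tried on a small made-up data frame. This is only a sketch: the data frame `toy` and its columns are invented for illustration and are not my real data.

```r
# Toy data (hypothetical): IDs "a" and "b" occur more than once
toy <- data.frame(ID = factor(c("a", "b", "a", "c", "b", "a")),
                  value = c(1, 2, 3, 4, 5, 6))

# Keep only the first row encountered for each ID
dedup <- toy[!duplicated(toy$ID), ]

# Tabulate the duplicate flags; only FALSE should appear
table(duplicated(dedup$ID))

# anyDuplicated() returns 0 when no element is repeated
anyDuplicated(dedup$ID)
```

Note that duplicated() flags the second and later occurrences, so the row kept for each ID is the first one in the current row order.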

>From: "Alec Stephenson" <astephen at efs.mq.edu.au>
>To: <gerifalte28 at hotmail.com>, <r-help at stat.math.ethz.ch>
>Subject: Re: [R] Unique.data.frame...still getting duplicates
>Date: Fri, 25 Jun 2004 12:45:26 +1000
>
>data[!duplicated(data$ID),]
>will do. Your unique(data[ID,]) removes duplicated rows in data[ID,],
>assuming the object ID exists.
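
A small illustration of the difference, on invented data (the data frame `toy` below is hypothetical): unique() on a data frame compares whole rows, so two rows that share an ID but differ in another column both survive, whereas indexing with !duplicated() on the ID column keeps one row per ID.

```r
# Toy data (hypothetical): ID 1 is an exact repeat, ID 2 differs in 'value'
toy <- data.frame(ID = c(1, 1, 2, 2, 3),
                  value = c(10, 10, 20, 25, 30))

by_row <- unique(toy)                  # drops only rows identical in every column
by_id  <- toy[!duplicated(toy$ID), ]   # keeps the first row for each ID

nrow(by_row)                 # 4
any(duplicated(by_row$ID))   # TRUE: ID 2 still appears twice
nrow(by_id)                  # 3
any(duplicated(by_id$ID))    # FALSE
```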
>
>
>
>Alec Stephenson
>Department of Statistics
>Macquarie University
>NSW 2109, Australia
>
> >>> "F Z" <gerifalte28 at hotmail.com> 06/25/04 12:12pm >>>
>Hi there
>
>I have a data frame with about 65,000 rows and 8 variables.  I am trying to
>get rid of the double entries of a factor variable "ID" so I can get a
>unique observation for each ID
>
>I tried:
>
> >dupl <- unique.data.frame(data[ID,]) #I obtain a data frame with 21,547
> >#observations..so far so good, but then when I check for duplicates
>
> >d <- duplicated(dupl2$ID)
> >summary(as.factor(d))
>FALSE  TRUE
>   6836 14711
>
>Meaning that I am still getting 14,711 duplicates!
>
>I tried changing the ID type to integer and repeated the process but I got
>identical results....what am I missing?
>
>Thanks!
>