[R] Unique.data.frame...still getting duplicates

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jun 25 08:13:10 CEST 2004


Your code cannot possibly work in a recent version of R, so please try the
current version (1.9.1).

data[ID, ] is what?  Why not just call unique() on ID?

BTW, if you call methods such as unique.data.frame directly you are adding a
possible source of error -- here I suspect data[ID, ] is not what you intend.
Please call the generic.
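
As a minimal sketch of the advice above, assuming the aim is one row per ID:
the toy data frame `dat` and its values are made up for illustration and are
not the poster's data. unique() on a data frame drops only rows that are
identical in every column, so repeated IDs can remain; indexing with
!duplicated() on the ID column keeps the first row for each ID.

```r
# Toy data frame standing in for the poster's data (hypothetical values)
dat <- data.frame(
  ID    = factor(c("a", "b", "a", "c", "b")),
  value = c(1, 2, 3, 4, 5)
)

# Calling the generic dispatches to the data-frame method; it removes only
# rows that are duplicated across all columns, so repeated IDs can survive
u <- unique(dat)

# Keep the first row encountered for each ID
first_per_id <- dat[!duplicated(dat$ID), ]

# The distinct IDs themselves, from calling unique() on the ID column
ids <- unique(dat$ID)
```

Which row to keep for each ID (first, last, or some summary) depends on the
data; !duplicated() keeps the first in the existing row order.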

On Fri, 25 Jun 2004, F Z wrote:

> Hi there
> 
> I have a data frame with about 65,000 rows and 8 variables.  I am trying to 
> get rid of the double entries of a factor variable "ID" so I can get a 
> unique observation for each ID
> 
> I tried:
> 
> >dupl_unique.data.frame(data[ID,]) #I obtain a data frame with 21,547 
> >observations..so far so good, but then when I check for duplicates
> 
> >d_duplicated(dupl2$ID)
> >summary(as.factor(d))
> FALSE  TRUE
>   6836 14711
> 
> Meaning that I am still getting 14,711 duplicates!
> 
> I tried changing the ID type to integer and repeated the process but I got 
> identical results....what am I missing?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



