[R] Extracting only Unique Rows based on only 1 Column

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jan 17 00:06:09 CET 2010


Try this, where DF is your data frame and ID is the column that must be
unique (the first row found for each ID is the one kept):

subset(DF, !duplicated(ID))

or equivalently:

DF[!duplicated(DF$ID), ]
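
As a minimal sketch of how this plays out on the example from the quoted
question, assuming the data sit in a data frame named traveltimes (the
names first_rows and last_rows are made up here for illustration):

```r
# Example data copied from the quoted question (assumed to be a data frame)
traveltimes <- data.frame(
  ID    = c(1, 1, 2, 2, 3, 3),
  Time1 = c(3, 4, 3, 5, 4, 2),
  Time2 = c(4, 7, 5, 6, 5, 8)
)

# duplicated() flags the second and later occurrences of each ID,
# so negating it keeps the first row seen for each ID
first_rows <- traveltimes[!duplicated(traveltimes$ID), ]

# fromLast = TRUE scans from the end, so the last row per ID is kept instead
last_rows <- traveltimes[!duplicated(traveltimes$ID, fromLast = TRUE), ]

print(first_rows)
print(last_rows)
```

Both results have three rows, one per ID. If traveltimes is really a
matrix rather than a data frame, the $ form will not work; indexing the
column by name, as in traveltimes[!duplicated(traveltimes[, "ID"]), ],
should do the same job.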


On Sat, Jan 16, 2010 at 5:04 PM, Bryan M Hangartner
<hangartb at cecs.pdx.edu> wrote:
> To Whomever is Interested,
>
> I have spent several days searching the web, help files, the R wiki and the
> archives of this mailing list for a solution to this problem, but
> nonetheless I apologize in advance if I have missed something obvious.
>
> The problem is this: I have a 5-column data frame with about 4.2 million
> rows, and want to create a new (and hopefully much smaller) data frame that
> contains only the rows which have a unique value in the first column only.
> In other words, I do not care about the uniqueness of the values in the
> other four columns, only the uniqueness of the entries in the first column. The
> "unique" command does not seem to have this option available, at least based
> on what I've read in the help file.
>
> A simplified example matrix (designated as "traveltimes"):
>
> ID Time1 Time2
> 1    3     4
> 1    4     7
> 2    3     5
> 2    5     6
> 3    4     5
> 3    2     8
>
> When I use a command such as
>
> matches <- unique(traveltimes, incomparables = FALSE, fromLast = FALSE)
>
> I will end up with a 6-row matrix, exactly what I already have. What I would
> like to do is to remove the duplicate values in the column labeled "ID" and
> their associated Time1 and Time2 entries. This will give me a 3x3 matrix
> which contains only one instance of each "ID" variable. For the purposes of
> this particular problem, the uniqueness of the Time1 and Time2 columns is not
> relevant.
>
> If this question is not clear enough please let me know. Thank you for your
> time.
>
>
> --
> Bryan Hangartner
> hangartb at cecs.pdx.edu
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


