[R] Duplicates and duplicated
Linlin Yan
yanlinlin82 at gmail.com
Thu May 14 08:23:56 CEST 2009
On Thu, May 14, 2009 at 2:16 PM, christiaan pauw <cjpauw at gmail.com> wrote:
> Hi everybody.
> I want to identify not only the duplicate number but also the original number
> that has been duplicated.
> Example:
> x=c(1,2,3,4,4,5,6,7,8,9)
> y=duplicated(x)
> rbind(x,y)
>
> gives:
>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> x    1    2    3    4    4    5    6    7    8     9
> y    0    0    0    0    1    0    0    0    0     0
>
> i.e. the second 4 [,5] is a duplicate.
>
> What I want is for both the first and second 4, i.e. [,4] and [,5], to be TRUE:
>
>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> x    1    2    3    4    4    5    6    7    8     9
> y    0    0    0    1    1    0    0    0    0     0
>
How about:
rbind(x, duplicated(x) | duplicated(x, fromLast=TRUE))
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    4    5    6    7    8     9
     0    0    0    1    1    0    0    0    0     0
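
This works because the two calls flag opposite ends of each group of equal values: duplicated(x) marks every occurrence after the first, and duplicated(x, fromLast=TRUE) marks every occurrence before the last, so combining them with | marks every value that occurs more than once. If it is needed repeatedly, it could be wrapped in a small helper. This is only a sketch, and the name all_dups is made up for illustration (it is not a base R function):

```r
# Hypothetical helper (the name is an assumption, not part of base R):
# returns TRUE for every element of x whose value occurs more than once.
all_dups <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
flags <- all_dups(x)

which(flags)   # positions of all copies, original included
x[flags]       # the duplicated values themselves
```

The helper returns a logical vector, so it can also be used directly as an index or passed to rbind() as above.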
> I assume it can be done by sorting the vector and then checking whether the next
> or the previous entry matches, using identical(). I am just unsure how to
> write such a loop, the logic of which (I think) is as follows:
>
> sort x
> for every value of x check if the next value is identical and return TRUE
> (or 1) if it is and FALSE (or 0) if it is not
> AND
> check if the previous value is identical and return TRUE (or 1) if it is and
> FALSE (or 0) if it is not
>
> Am I thinking correctly, and can someone help me write such a function?
>
> regards
> Christiaan
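
The sort-and-compare logic you describe should also work, and it does not need an explicit loop, since the neighbour comparison can be vectorised. Here is a rough sketch, assuming an atomic vector without NA values. The function name dups_by_sorting and the variable names are made up for illustration, and == is used in place of identical() because it compares element by element:

```r
# Hypothetical function (name is an assumption): flags every element of x
# whose value occurs more than once, using sort + neighbour comparison.
dups_by_sorting <- function(x) {
  ord <- order(x)        # permutation that sorts x
  s <- x[ord]            # sorted copy of x
  n <- length(s)

  # TRUE where a sorted value equals its right-hand neighbour
  same_next <- s[-n] == s[-1]

  # an element belongs to a run of equal values if it matches
  # the next OR the previous sorted element
  in_run <- c(same_next, FALSE) | c(FALSE, same_next)

  # map the flags back to the original order of x
  result <- logical(n)
  result[ord] <- in_run
  result
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
rbind(x, dups_by_sorting(x))
```

For this x it should give the same flags as duplicated(x) | duplicated(x, fromLast=TRUE), which is shorter and avoids the sort.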