[R] which element is duplicated?

William Dunlap wdunlap at tibco.com
Tue Nov 13 18:58:37 CET 2018


You also asked about doing this for the rows of a matrix.  unique() gives
the unique rows, but match() operates per element, not per row.  You can
use split() to turn the matrix into a list of rows and match on that.

> m <- cbind( A=c(i=5,ii=5,iii=5,iv=4,v=4,vi=4), B=c(2,3,2,2,2,2) )
> unique(m)
   A B
i  5 2
ii 5 3
iv 4 2
> match(m, unique(m)) # bad
 [1] 1 1 1 3 3 3 4 5 4 4 4 4
> asRows <- function(x) split(x, seq_len(NROW(x))) # convert to list of rows
> match(asRows(m), unique(asRows(m)))
[1] 1 2 1 3 3 3
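
If you want indices into the original rows rather than into the unique
rows, the same trick can be wrapped in a small helper.  This is only a
sketch: firstMatchingRow is a made-up name, not an existing function, and
it matches the list of rows against itself, in the spirit of match(v, v)
from earlier in the thread.

```r
# Convert a matrix to a list of rows (same definition as above).
asRows <- function(x) split(x, seq_len(NROW(x)))

# Hypothetical helper: for each row of a matrix, the index of the first
# row with the same contents.
firstMatchingRow <- function(x) {
  rows <- asRows(x)
  match(rows, rows)
}

m <- cbind(A = c(i = 5, ii = 5, iii = 5, iv = 4, v = 4, vi = 4),
           B = c(2, 3, 2, 2, 2, 2))
firstMatchingRow(m)
# [1] 1 2 1 4 4 4
```

Rows iv, v and vi all point at row 4 here, where the version using
unique() reports 3, the position within the unique rows.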


For data.frames, unique() works on rows but match() works on columns, and
converting to a list of rows does not quite work, because unique() looks at
the row names.  A modification of asRows works around that:

> d <- data.frame(m)
> unique(d)
   A B
i  5 2
ii 5 3
iv 4 2
> match(d, unique(d))
[1] NA NA
> asRows <- function(x) lapply(split(x, seq_len(NROW(x))), as.list)
> match(asRows(d), unique(asRows(d)))
[1] 1 2 1 3 3 3
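
The other form asked for in the original question, NA for a first
occurrence and the index of the earlier row otherwise, can be sketched the
same way.  The variable names below are only for illustration, and the
data.frame is rebuilt here so the snippet runs on its own.

```r
# Convert a data.frame to a list of rows, dropping row names
# (same definition as above).
asRows <- function(x) lapply(split(x, seq_len(NROW(x))), as.list)

d <- data.frame(A = c(5, 5, 5, 4, 4, 4), B = c(2, 3, 2, 2, 2, 2),
                row.names = c("i", "ii", "iii", "iv", "v", "vi"))

rows <- asRows(d)
first <- match(rows, rows)               # index of first identical row
first[first == seq_along(first)] <- NA   # a row matching itself is not a duplicate
first
# [1] NA NA  1 NA  4  4
```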


Is this the sort of issue that Hadley's vctrs package is addressing?

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Nov 13, 2018 at 2:15 AM, Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:

> On 13/11/2018 12:35 AM, Pages, Herve wrote:
>
>> Hi,
>>
>> On 11/12/18 17:08, Duncan Murdoch wrote:
>>
>>> The duplicated() function gives TRUE if an item in a vector (or row in
>>> a matrix, etc.) is a duplicate of an earlier item.  But what I would
>>> like to know is which item does it duplicate?
>>>
>>> For example,
>>>
>>> v <- c("a", "b", "b", "a")
>>> duplicated(v)
>>>
>>> returns
>>>
>>> [1] FALSE FALSE  TRUE  TRUE
>>>
>>> What I want is a fast way to calculate
>>>
>>>   [1] NA NA 2 1
>>>
>>> or (equally useful to me)
>>>
>>>   [1] 1 2 2 1
>>>
>>> The result should have the property that if result[i] == j, then v[i]
>>> == v[j], at least for i != j.
>>>
>>> Does this already exist somewhere, or is it easy to write?
>>>
>>
>> I generally use match() for that:
>>
>>   > v <- c("a", "b", "b", "a")
>>
>>   > match(v, v)
>>
>> [1] 1 2 2 1
>>
>
> Yes, this is perfect.  Thanks to you (and the private answer I received
> that suggested the same).
>
> Duncan Murdoch
