[R] intersect() without discarding duplicates?

Gabor Grothendieck ggrothendieck at gmail.com
Fri May 21 03:55:30 CEST 2010


Try this one-liner.  The first argument of rep is the sorted
intersection, and the second argument is the parallel minimum of the
counts of elements in a that are also in b and the counts of elements
in b that are also in a.

rep(sort(intersect(a, b)), pmin(table(a[a %in% b]), table(b[b %in% a])))

It does have the advantage of not introducing factors.
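
For reuse, the same idea can be wrapped in a small function.  This is
only a sketch: the name intersect_dup is made up for illustration, and
the as.vector() calls are an extra precaution (not part of the
one-liner) that strips the table attributes before pmin sees them.  It
relies on table() reporting its counts in the same sorted order that
sort(intersect(a, b)) produces.

```r
# intersect_dup is a hypothetical name, not a base R function.
intersect_dup <- function(a, b) {
  # Distinct shared values, in sorted order.
  common <- sort(intersect(a, b))
  # Per-value counts of the shared elements in each vector;
  # as.vector() drops the table class and names (added precaution).
  count_a <- as.vector(table(a[a %in% b]))
  count_b <- as.vector(table(b[b %in% a]))
  # Repeat each shared value by the smaller of its two counts.
  rep(common, pmin(count_a, count_b))
}

a <- c(2, 4, 2, 3)
b <- c(6, 6, 5, 2, 2, 8, 4)
intersect_dup(a, b)           # two 2's in both vectors
intersect_dup(c(2, 4, 3), b)  # only one 2 in the first vector
intersect_dup(b, c(2, 4, 3))  # same result with the arguments swapped
```

Because the repeat counts come from pmin, the result should not depend
on which vector is passed first.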

On Thu, May 20, 2010 at 6:24 PM, Jonathan <jonsleepy at gmail.com> wrote:
> Thanks, but that doesn't quite work, since I'd want the result of b[b %in%
> a] to be symmetric with a[a%in%b] (so if there are two 2's in EACH vector,
> I'll get two 2's in the result, but if there are two 2's in one vector
> and only one 2 in the other, the result will show only one 2).
>
> Consider:
>
>> a <- c(2,4,3)
>> b<-c(6,6,5,2,2,8,4)
>
>> b[b %in% a]
> [1] 2 2 4
>
>> a[a%in%b]
> [1] 2 4
>
> The second answer is correct, but I can't predict which variable to put in
> which position in the statement, so I'd need them both to be correct.
>
> Best,
> Jonathan
>
> On Thu, May 20, 2010 at 6:10 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
>>
>> On May 20, 2010, at 5:58 PM, Jonathan wrote:
>>
>>> Hi all,
>>>  The ?intersect entry kindly points out that it discards duplicate
>>> entries.  I'm looking, however, to get the intersection while KEEPING
>>> duplicate entries, and there are no instructions on how to accomplish this
>>> using intersect().
>>>
>>> Does anybody have any idea how this might be done, or am I going to need
>>> to program something from scratch (something like ordering the vectors
>>> and then looping through them)?
>>>
>>>
>>>
>>> ex:
>>>
>>> > a <- c(2,4,2,3)
>>> > b <- c(6,6,5,2,2,8,4)
>>> > intersect(a,b)
>>> [1] 2 4
>>>
>>
>> > b %in% a
>> [1] FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE
>>
>> # Now use logical indexing on "b"
>>
>> > b[b %in% a]
>> [1] 2 2 4
>>
>>
>>>
>>>
>>> I'd hope the answer to be 2 2 4.
>>>
>>> Regards,
>>> Jonathan
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>


