[R] counting duplicate items that occur in multiple groups

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Tue Nov 17 23:22:48 CET 2020


Inline.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <twoolman using ontargettek.com>
wrote:

> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> plus a bank account number for each vendor.


I interpret this as: "all vendors are unique and each vendor has a single
bank account." Is that correct?


> I'm trying to find a way
> to count the number of duplicate bank accounts that occur in more than
> one unique Vendor_ID,


The following makes no sense to me, as each row is a unique vendor and has
only one bank account.

> and then assign the count value for each row in
> the dataframe in a new variable.
>
> I can do a count of bank accounts that occur within the same vendor
> using dplyr and group_by and count, but I can't figure out a way to
> count duplicates among multiple Vendor_IDs.
>
I interpret this to mean that you want to count vendor IDs by account.
With only one account per vendor this is trivial; e.g.

set.seed(22)
d1 <- data.frame(id = sample(1:30),
                 account = sample(1:20, 30, replace = TRUE))

table(d1$account)

## gives
 1  2  3  6  7  8  9 10 11 13 15 16 17 18 19 20
 3  1  2  1  1  1  1  1  4  3  1  2  1  3  2  3
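
If the goal is also to attach that count to every row as a new variable,
one possible sketch in base R uses ave(). The column name n_vendors and the
object name shared are made up for illustration, and this assumes, as above,
one row per vendor:

```r
set.seed(22)
d1 <- data.frame(id = sample(1:30),
                 account = sample(1:20, 30, replace = TRUE))

# For each row, the number of distinct vendor IDs that share its account.
# n_vendors is a hypothetical column name, not from the original post.
d1$n_vendors <- ave(d1$id, d1$account,
                    FUN = function(x) length(unique(x)))

# Rows whose account is shared by more than one vendor.
shared <- d1[d1$n_vendors > 1, ]
head(shared[order(shared$account), ])
```

ave() keeps the rows in their original order, so the result can be assigned
straight back into the data frame.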

Note that AFAICS your example is useless, as both columns are permutations
of 1:10000: it gives the same number of different account numbers as IDs,
so no duplication can occur.
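
A quick check of this on the posted data, as a minimal sketch (sample()
without replace = TRUE returns a permutation, so every account number
appears exactly once):

```r
set.seed(1)
Data <- data.frame(Vendor_ID = sample(1:10000),
                   Bank_Account_ID = sample(1:10000))

# Index of the first duplicated account number, or 0 if there is none.
anyDuplicated(Data$Bank_Account_ID)   # 0

# Largest number of rows sharing any one account number.
max(table(Data$Bank_Account_ID))      # 1
```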

As my interpretations are likely incorrect and this is not what you mean
nor want, either clarify your meaning and provide a useful **minimal**
example; or wait for a reply from someone with a better understanding than
I.

Cheers,
Bert





>
> Dataframe example code:
>
>
> #Create a sample data frame:
>
> set.seed(1)
>
> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> sample(1:10000))
>
>
>
>
> Thanks in advance for any help.
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



