[R] How to group by and get distinct rows of grouped rows based on certain criteria

Sarah Goslee sarah.goslee at gmail.com
Thu Jul 14 22:50:20 CEST 2016


I took a wild guess as to what your data looked like (please use
dput() to provide data, and please do not post in HTML), and took your
request literally.

Here's one way to approach the problem:


mydat <- structure(list(ATP.Group = c("02", "02", "02", "ZM", "ZM", "ZM",
"02", "02", "02"), Business.Event = c("A", "A", "A", "A", "A",
"A", "B", "B", "B"), Category = c("AC", "AD", "EQ", "AU", "AV",
"AW", "AC", "AY", "EQ")), .Names = c("ATP.Group", "Business.Event",
"Category"), class = "data.frame", row.names = c(NA, -9L))


# Combinations that have at least one row with Category "EQ"
hasEQ <- subset(mydat, Category == "EQ")
hasEQ <- unique(do.call("paste", c(hasEQ[, c("ATP.Group",
"Business.Event")], sep="/")))

# Combinations with non-EQ rows, minus any that also appear in hasEQ
notEQ <- subset(mydat, Category != "EQ")
notEQ <- unique(do.call("paste", c(notEQ[, c("ATP.Group",
"Business.Event")], sep="/")))
notEQ <- notEQ[!(notEQ %in% hasEQ)]

> hasEQ
[1] "02/A" "02/B"
> notEQ
[1] "ZM/A"
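
The same split can probably be done in one pass by flagging each
combination with tapply(). This is only a sketch under the same guess
about the data; the names key, anyEQ, hasEQ2 and notEQ2 are made up
for illustration.

```r
# Same guessed data as above, rebuilt so the sketch runs on its own
mydat <- data.frame(
  ATP.Group = c("02", "02", "02", "ZM", "ZM", "ZM", "02", "02", "02"),
  Business.Event = c("A", "A", "A", "A", "A", "A", "B", "B", "B"),
  Category = c("AC", "AD", "EQ", "AU", "AV", "AW", "AC", "AY", "EQ"),
  stringsAsFactors = FALSE
)

# One key per row, e.g. "02/A" (assumes "/" never occurs in the fields)
key <- paste(mydat$ATP.Group, mydat$Business.Event, sep = "/")

# For each key: TRUE if any row in that combination has Category "EQ"
anyEQ <- tapply(mydat$Category == "EQ", key, any)

hasEQ2 <- names(anyEQ)[as.vector(anyEQ)]   # "02/A" "02/B"
notEQ2 <- names(anyEQ)[!as.vector(anyEQ)]  # "ZM/A"
```

Because every combination gets exactly one TRUE or FALSE, there is no
need for the extra %in% step to drop combinations that appear in both
sets.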

Sarah

On Thu, Jul 14, 2016 at 3:43 PM, Satish Vadlamani
<satish.vadlamani at gmail.com> wrote:
> Hello All:
> I would like to get your help on the following problem.
>
> I have the following data and the first row is the header. Spaces are not
> important.
> I want to find out distinct combinations of ATP Group and Business Event
> (these are the field names that you can see in the data below) that have
> the Category EQ (Category is the third field) and those that do not have
> the category EQ. In the example below, the combinations 02/A and 02/B have
> EQ and the combination ZM/A does not.
>
> If I have a larger file, how do I get to this answer?
>
> What did I try (with dplyr)?
>
> # I know that the below is not correct and not giving desired results
> file1_1 <- file1  %>% group_by(ATP.Group,Business.Event) %>%
> filter(Category != "EQ") %>% distinct(ATP.Group,Business.Event)
> # for some reason, I have to convert to data.frame to print the data correctly
> file1_1 <- as.data.frame(file1_1)
> file1_1
>
>
> *Data shown below*
> |ATP Group|Business Event|Category|
> |02       |A             |AC      |
> |02       |A             |AD      |
> |02       |A             |EQ      |
> |ZM       |A             |AU      |
> |ZM       |A             |AV      |
> |ZM       |A             |AW      |
> |02       |B             |AC      |
> |02       |B             |AY      |
> |02       |B             |EQ      |
>
> --


