[R] DROP OBSERVATIONS IN A GROUP

David Winsemius dwinsemius at comcast.net
Wed Jun 29 22:42:53 CEST 2011


On Jun 29, 2011, at 4:29 PM, Peter Maclean wrote:

> People with more experience in R, I need help on this.
> I would like to drop observations if they meet a certain condition. In
> this example I would like to drop group 2 in "n" because that group has
> more than 2 zeroes in "y".
> #Example
> n <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3,3)
> y <- c(2,3,2,3,4,5,6,1,0,0,0,6, 2, 1, 0, 0,9,3)
> z <- as.data.frame(cbind(n,y))

The strategy of cbind-ing vectors, passing the result to as.data.frame,
and then naming the columns seems wasteful and error-prone. Why not:

  z <- data.frame(n=factor(n),y=y)
# All in one step: no issue with every element needing to be the same mode,
# and no loss of attributes, which coercion to a matrix would impose.
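
To illustrate the mode issue, here is a minimal sketch. The character vector g is hypothetical and not part of the original data. cbind() builds a matrix first, so one character vector forces every column to a non-numeric type, whereas data.frame() keeps each column's own type:

```r
n <- c(1, 1, 2, 2)
g <- c("a", "b", "a", "b")   # hypothetical extra column, for illustration only

m <- as.data.frame(cbind(n, g))   # goes through a character matrix
d <- data.frame(n = n, g = g)     # each column keeps its own type

is.numeric(m$n)   # FALSE: n was coerced along with g
is.numeric(d$n)   # TRUE
```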

Then you can use ave() to return group-wise counts of zeroes:

 > with(z, ave(y, n, FUN=function(x) sum(x==0) ) )
  [1] 0 0 0 0 0 0 3 3 3 3 3 3 2 2 2 2 2 2
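
A small sketch, not part of the original reply, that may help show what ave() is doing: tapply() gives one zero count per group, and ave() spreads those same counts back over the rows. The name per_group is only illustrative.

```r
n <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3)
y <- c(2, 3, 2, 3, 4, 5, 6, 1, 0, 0, 0, 6, 2, 1, 0, 0, 9, 3)
z <- data.frame(n = factor(n), y = y)

# One count of zeroes per group (illustrative name)
per_group <- with(z, tapply(y, n, function(x) sum(x == 0)))
per_group

# ave() returns the same counts, repeated once per row of each group
per_row <- with(z, ave(y, n, FUN = function(x) sum(x == 0)))
all(per_row == as.vector(per_group[as.character(z$n)]))
```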

And use that to test for your condition for not dropping:

 > z[ with(z, ave(y, n, FUN=function(x) sum(x==0) ) ) <= 2, ]
    n y
1  1 2
2  1 3
3  1 2
4  1 3
5  1 4
6  1 5
13 3 2
14 3 1
15 3 0
16 3 0
17 3 9
18 3 3
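
If this filter is needed more than once, the same test could be wrapped in a small function. This is only a sketch: keep_groups and its max_zeros argument are hypothetical names, not anything from base R. Because n is a factor, the dropped group remains as an empty level after subsetting, and droplevels() removes it:

```r
# Hypothetical helper: TRUE for rows whose group has at most
# `max_zeros` zero values in `value`.
keep_groups <- function(value, group, max_zeros = 2) {
  ave(value, group, FUN = function(x) sum(x == 0)) <= max_zeros
}

n <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3)
y <- c(2, 3, 2, 3, 4, 5, 6, 1, 0, 0, 0, 6, 2, 1, 0, 0, 9, 3)
z <- data.frame(n = factor(n), y = y)

kept <- z[keep_groups(z$y, z$n), ]
kept <- droplevels(kept)   # remove the now-unused level "2"
levels(kept$n)
```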

> colnames(z) <- c("n","y")

David Winsemius, MD
West Hartford, CT
