[R] dataframe indexing by number of cases per group
djmuser at gmail.com
Thu Nov 24 14:47:04 CET 2011
A very similar question was asked a couple of days ago - see the
thread titled "Removing rows in dataframe w'o duplicated values" - in
particular, the responses by Dimitris Rizopoulos and David Winsemius.
The adaptation to this problem is:
df[ave(seq_along(df$group), df$group, FUN = length) > 4, ]
   group        x
1      A 3.903747
2      A 3.599547
3      A 2.449991
4      A 2.740639
5      A 4.268988
6      B 8.649600
7      B 5.493841
8      B 1.892154
9      B 6.781754
10     B 1.459250
11     B 6.749522
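
To see why the one-liner works, here is a minimal sketch that rebuilds the data from the original post. The set.seed() call and the names n_per_row, big, counts and big2 are my additions for illustration, so the x values will differ from the output above. ave() returns, for every row, the number of rows in that row's group, and comparing that vector against the threshold gives a logical row index. A table()-based lookup is shown as one possible alternative, not as the only way to do it.

```r
# Rebuild the example data. set.seed() is an assumption added for
# reproducibility; x will not match the values printed above.
set.seed(1)
group <- c(rep("A", 5), rep("B", 6), rep("C", 4))
x <- c(runif(5, 1, 5), runif(6, 1, 10), runif(4, 2, 15))
df <- data.frame(group, x)

# ave() applies length() within each group and returns one value per row.
n_per_row <- ave(seq_along(df$group), df$group, FUN = length)
n_per_row
# 5 5 5 5 5 6 6 6 6 6 6 4 4 4 4

# Keep only rows whose group has at least 5 cases (groups A and B).
big <- df[n_per_row >= 5, ]

# Alternative: count cases per group with table(), then keep rows whose
# group label is among the sufficiently large groups.
counts <- table(df$group)
big2 <- df[df$group %in% names(counts)[counts >= 5], ]
```

Both big and big2 should contain the 11 rows of groups A and B. Grouping on df$group directly, rather than on as.numeric(df$group), works whether group is stored as a factor or as a character vector.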
On Thu, Nov 24, 2011 at 4:02 AM, Johannes Radinger <JRadinger at gmx.at> wrote:
> Assume we have the following data frame:
> group <-c(rep("A",5),rep("B",6),rep("C",4))
> x <- c(runif(5,1,5),runif(6,1,10),runif(4,2,15))
> df <- data.frame(group,x)
> Now I want to select all cases (rows) for those groups
> that have at least 5 cases (so I want to select
> all cases of groups A and B).
> How can I use the indexing for such questions?
> df[??]... I think it is probably quite easy but I really
> don't know how to do that at the moment.
> maybe someone can help me...