[R] ddply to count frequency of combinations

Brian Diggs diggsb at ohsu.edu
Tue Jun 21 21:54:40 CEST 2011


On 6/21/2011 11:30 AM, Idris Raja wrote:
> I have a dataframe df with two columns x and y. I want to count the number
> of times a unique x, y combination occurs.
>
> For example
>
> x<- c(1,2,3,4,5,1,2,3,4)
> y<- c(1,2,3,4,5,1,2,4,1)
>
> df<-as.data.frame(cbind(x, y))
>
> #what is the correct way to use ddply for this example?
> ddply(df, c('x','y'), summarize, ??)
>
> #desired output -- format and order doesn't matter
> # (x, y) count
> #--------------------
> # (1, 1) 2
> # (2, 2) 2
> # (3, 3) 1
> # (4, 4) 1
> # (5, 5) 1
> # (3, 4) 1
> # (4, 1) 1
>

Jorge and Dennis gave good responses that get you to the result you 
asked for, but for completeness I thought I'd include some ddply versions:

library(plyr)
ddply(df, .(x, y), summarize, freq=length(x))

This uses the summarize function you were asking about; however, you can 
also do it with:

ddply(df, .(x, y), nrow)

or

ddply(df, .(x, y), as.data.frame(nrow))

The latter gives a slightly nicer column name (value instead of V1).
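
If neither V1 nor value is the name you want, one option is to pass an 
anonymous function that returns a one-row data frame with an explicitly 
named column. This is only a sketch: the column name n and the argument 
name piece are arbitrary choices for illustration, and the data frame is 
rebuilt from the x and y in the question so the snippet runs on its own.

```r
library(plyr)

# Data from the original question
df <- data.frame(x = c(1, 2, 3, 4, 5, 1, 2, 3, 4),
                 y = c(1, 2, 3, 4, 5, 1, 2, 4, 1))

# ddply calls the function once per (x, y) group; 'piece' is the subset
# of rows in that group, so nrow(piece) is the group's frequency.
# 'n' is an illustrative column name, not anything plyr requires.
named <- ddply(df, .(x, y), function(piece) data.frame(n = nrow(piece)))
named
```

The result should have the columns x, y and n, with one row per 
combination that actually occurs in df.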

As an aside, I prefer using the "summarise" spelling of the function 
when I do use it, because it won't clash with Hmisc::summarize.

ddply(df, .(x, y), summarise, freq=length(x))
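
For anyone who wants to run this end to end, here is a minimal sketch 
using the x and y from the question. The object name counts is my own 
choice, and the stopifnot() checks are only there as a sanity test, 
assuming plyr is installed.

```r
library(plyr)

# Data from the original question
x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4)
y <- c(1, 2, 3, 4, 5, 1, 2, 4, 1)
df <- data.frame(x, y)

# One output row per (x, y) combination present in df;
# length(x) is evaluated within each group, so it is the group size.
counts <- ddply(df, .(x, y), summarise, freq = length(x))
counts

# Sanity checks: the frequencies add up to the number of input rows,
# and there is exactly one output row per distinct (x, y) pair.
stopifnot(sum(counts$freq) == nrow(df))
stopifnot(nrow(counts) == nrow(unique(df)))
```

Combinations that never occur in the data do not appear in the result 
at all, because ddply only creates groups from the rows it is given.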


-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


