[R] counting frequencies across two columns

David Winsemius dwinsemius at comcast.net
Sun Nov 1 13:48:21 CET 2009


On Nov 1, 2009, at 1:59 AM, Patrick Connolly wrote:

> On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote:
>
>> I've got a data frame describing comments on an electronic journal,
>> wherein each row is a unique comment, like so:
>>
>> commentID  author articleID
>> 1         1   smith         2
>> 2         2   jones         3
>> 3         3 andrews         2
>> 4         4   jones         1
>> 5         5 johnson         3
>> 6         6   smith         2
>
> Let's call that dataframe x
>
>>
>> I want to know the number of unique authors per article.  I can get a
>> table of article frequencies with table(articleID), but I can't figure
>> out how to count frequencies in a different column.  I'm sure there's
>> an easy way, but I guess I'm too new at this to find it.
>
> I'm not clear what you require, but maybe it's this:
>
>> with(x, table(articleID, author))
>
> articleID andrews johnson jones smith
>        1       0       0     1     0
>        2       1       0     0     2
>        3       0       1     1     0
>
> Is that anything like what you're after?

You've had two guesses so far and my guess increments the count.

Were you attempting to specify this?

df1 <- read.table(textConnection("commentID  author articleID
1         1   smith         2
2         2   jones         3
3         3 andrews         2
4         4   jones         1
5         5 johnson         3
6         6   smith         2"), header=TRUE)

 > lapply( lapply(tapply(df1$author, df1$articleID, I), unique), length)
$`1`
[1] 1

$`2`
[1] 2

$`3`
[1] 2
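
The nested lapply calls above can probably be collapsed into a single tapply call with an anonymous function. This is only a sketch: it rebuilds the same six rows with data.frame() so that it runs on its own, and the name n_authors is mine, not something from the original question.

```r
# Same six comments as df1 above, rebuilt so the block is self-contained
df1 <- data.frame(
  commentID = 1:6,
  author    = c("smith", "jones", "andrews", "jones", "johnson", "smith"),
  articleID = c(2, 3, 2, 1, 3, 2)
)

# For each articleID, take that article's authors and count the distinct ones
n_authors <- tapply(df1$author, df1$articleID,
                    function(a) length(unique(a)))
print(n_authors)
```

The result is indexed by articleID, so n_authors["2"] gives the count for article 2.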

Or, using Connolly's table as an intermediate step, with the result
delivered as a named vector rather than a list:

 > apply( with(df1, table(articleID, author)), 1, function(x) sum(x>0) )
1 2 3
1 2 2
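
If the apply() call feels heavy, two other routes should give the same counts. Both are sketches under the assumption that the data look like df1 above (rebuilt here so the block runs standalone); the names tab, counts_a, pairs and counts_b are just illustrative.

```r
# Same six comments as df1 above, rebuilt so the block is self-contained
df1 <- data.frame(
  commentID = 1:6,
  author    = c("smith", "jones", "andrews", "jones", "johnson", "smith"),
  articleID = c(2, 3, 2, 1, 3, 2)
)

# Route A: cross-tabulate, then count the non-zero cells in each row
tab      <- with(df1, table(articleID, author))
counts_a <- rowSums(tab > 0)

# Route B: drop duplicate (articleID, author) pairs first, then tabulate
# articleID, so each remaining row is one distinct author for that article
pairs    <- unique(df1[, c("articleID", "author")])
counts_b <- table(pairs$articleID)

print(counts_a)
print(counts_b)
```

Route B avoids building the full article-by-author table, which may matter if there are many authors.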

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



