[R] Counting

K. Elo maillists at nic.fi
Thu Mar 17 20:51:01 CET 2011


Dear Jim,

17.03.2011 20:54, Jim Silverton wrote:
> I have a matrix say:
>
> 23   1
> 12  12
> 0    0
> 0   1
> 0   1
> 0   2
> 23  2
>
> I want to count the number of distinct rows and the number of distinct elements
> in the second column and put these counts in a column. So at the end of the
> day I should have:
>
> c(1, 1, 1, 2, 2, 1, 1) for the distinct rows...

Let's suppose my.data is your data frame, "var" is the 1st column and 
"var1" is the second.

1) Create a 3rd column for the first task:
    my.data$var2<-0
2) Count distinct rows:

    for (i in 1:nrow(my.data)) {
        my.data$var2[i] <- nrow(subset(my.data, var == var[i] & var1 == var1[i]))
    }

After this, the output of "my.data$var2" is:

[1] 1 1 1 2 2 1 1
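
As an aside, the same row counts can probably be obtained without an explicit loop. The sketch below is only one possible alternative, not the method above: it rebuilds the example matrix as a data frame (the column names "var" and "var1" are the ones assumed in this mail) and uses ave(), which applies a function within each group defined by the grouping columns.

```r
# Example data rebuilt from the question; column names are assumptions
my.data <- data.frame(var  = c(23, 12, 0, 0, 0, 0, 23),
                      var1 = c(1, 12, 0, 1, 1, 2, 2))

# Group the row indices by both columns and replace each index
# with the size of its group, i.e. how often that row occurs
my.data$var2 <- ave(seq_len(nrow(my.data)),
                    my.data$var, my.data$var1,
                    FUN = length)

my.data$var2
```

For larger data frames this should scale better than calling subset() once per row, since the grouping is done only once.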

> ... and c(1, 1, 1, 2, 2, 2, 2) for the counts of how many times the
> elements in the second column occur.

Here I'm a bit confused. Shouldn't the count for the first element "1"
rather be 3, since the number 1 occurs three times in the second column?
If this is what you are looking for, then the following should work:

1) Create a 4th column for the second task:
    my.data$var3<-0
2) Count how often each element of the second column occurs:

    for (i in 1:nrow(my.data)) {
        my.data$var3[i] <- sum(my.data$var1 == my.data$var1[i])
    }

After this, the output of "my.data$var3" is:

[1] 3 1 1 3 3 2 2
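
Again, a loop-free variant is possible. This is just a sketch under the same assumptions as before (a data frame called my.data whose second column is named "var1"); it groups the second column by its own values with ave() and returns the group size for every row.

```r
# Example data rebuilt from the question; column names are assumptions
my.data <- data.frame(var  = c(23, 12, 0, 0, 0, 0, 23),
                      var1 = c(1, 12, 0, 1, 1, 2, 2))

# For every row, count how many rows share the same value in var1
my.data$var3 <- ave(my.data$var1, my.data$var1, FUN = length)

my.data$var3

# A frequency table of the second column is a quick cross-check
table(my.data$var1)
```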

HTH,
Kimmo


