[R] Numbering entries for each subject

Toni Pitcher toni.pitcher at otago.ac.nz
Thu Sep 22 05:02:02 CEST 2011


Hi R Users

I am hoping someone might be able to give some pointers on alternative code to the for loop described below.

I have a dataset ordered by subject ID and date. I would like to create a new variable that numbers the entries for each person (e.g. 1, 2, 3, ...).

As an example, suppose we have subjects A, B and C, all with multiple entries (I have excluded the date variable for simplicity). The for loop below achieves the desired result, but my dataset is big (more than 1 million observations) and the loop is slow. Is there a more efficient way of getting to the desired result?

Many thanks in advance

Toni 


A <- data.frame(ID=c('A','A','A','A','B','B','B', 'C','C','C','C','C'))

   ID
1   A
2   A
3   A
4   A
5   B
6   B
7   B
8   C
9   C
10  C
11  C
12  C


A$Session_ID <- 0
previous_ID <- ''
current_index <- 1
for (i in seq_len(nrow(A))) {
  # restart the counter whenever a new subject ID begins
  if (A$ID[i] != previous_ID) {
    current_index <- 1
  }
  A$Session_ID[i] <- current_index
  previous_ID <- A$ID[i]
  current_index <- current_index + 1
}

 

   ID Session_ID
1   A          1
2   A          2
3   A          3
4   A          4
5   B          1
6   B          2
7   B          3
8   C          1
9   C          2
10  C          3
11  C          4
12  C          5
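
For illustration, here is a minimal sketch of two loop-free ways the same numbering might be produced with base R. Neither is part of the loop above and neither has been timed on the full dataset. The first assumes the rows are already sorted so that each subject's entries are contiguous; the second groups by ID and should not need that. The column name Session_ID2 is only a placeholder so the two results can be compared.

```r
# Same toy data as in the example; IDs kept as character strings
A <- data.frame(ID = c('A','A','A','A','B','B','B','C','C','C','C','C'),
                stringsAsFactors = FALSE)

# Option 1: lengths of runs of consecutive identical IDs, expanded to 1..n per run.
# Assumes each subject's rows are contiguous (data sorted by ID).
runs <- rle(as.character(A$ID))
A$Session_ID <- sequence(runs$lengths)

# Option 2: number the rows within each ID group.
# Session_ID2 is a hypothetical column name used only for comparison.
A$Session_ID2 <- ave(seq_along(A$ID), A$ID, FUN = seq_along)

print(A)
```

On sorted data the two columns should agree. If the IDs were not contiguous, Option 1 would restart the count at every new run, whereas Option 2 would keep counting within each subject.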

