[R] RE : Create sequence for dataset

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sun Nov 21 23:57:49 CET 2004


ssim at lic.co.nz writes:

> Dear members,
> 
> I want to create a sequence of numbers for the multiple records of
> each individual animal in my dataset. The SAS code below will do the
> trick, but I want to learn to do it in R. Can anyone help?
> 
> data ht&ssn;
> set ht&ssn;
> by anml_key;
> if first.anml_key then do;
> seq_ht_rslt=0;
> end;
> seq_ht_rslt+1;
> 
> Thanks in advance.

Whoa. Who just said that SAS data step code was clearer than R? Quite
a bit of implicit knowledge in that one.
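
For those who don't speak SAS: the sum statement (seq_ht_rslt+1;)
implicitly retains the counter from one row to the next, and
first.anml_key is a flag that the BY statement sets on the first row of
each group, which in turn assumes the data are sorted by anml_key.
Spelled out literally in R, it is roughly the loop below. This is only a
sketch to show the logic; the key values are invented.

```r
# Hypothetical toy keys, already sorted as the BY statement requires.
anml_key <- c(101, 101, 101, 102, 103, 103)

seq_ht_rslt <- numeric(length(anml_key))
counter <- 0                      # carried across rows, like a SAS sum variable
for (i in seq_along(anml_key)) {
  # Plays the role of first.anml_key: TRUE on the first row of each run of equal keys
  first <- i == 1 || anml_key[i] != anml_key[i - 1]
  if (first) counter <- 0         # seq_ht_rslt=0;
  counter <- counter + 1          # seq_ht_rslt+1;
  seq_ht_rslt[i] <- counter
}
seq_ht_rslt
```

For the toy keys this gives 1 2 3 1 1 2.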

Here's one way (someone please think up a better name for ave()...):

> x <- numeric(nrow(airquality))
> ave(x, airquality$Month, FUN=seq_along)
  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
 [37]  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 [55] 24 25 26 27 28 29 30  1  2  3  4  5  6  7  8  9 10 11
 [73] 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
 [91] 30 31  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
[109] 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3
[127]  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
[145] 22 23 24 25 26 27 28 29 30
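
Applied to the original problem, the grouping variable would simply be
the animal key. A minimal sketch, assuming a data frame ht with a column
anml_key (the name comes from the SAS code; ht_rslt and all the values
are invented for illustration):

```r
# Toy data: only the column name anml_key comes from the question;
# ht_rslt and the values are hypothetical.
ht <- data.frame(anml_key = c(101, 101, 102, 101, 103, 103),
                 ht_rslt  = c(1.2, 1.3, 0.9, 1.4, 1.1, 1.0))

# Number the records within each animal, in order of appearance.
ht$seq_ht_rslt <- ave(numeric(nrow(ht)), ht$anml_key, FUN = seq_along)
ht$seq_ht_rslt
```

Unlike the BY statement, ave() does not need the rows sorted by key;
records are numbered in order of appearance within each animal, so the
toy data above give 1 2 1 3 1 2.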

or, same basic idea but a little less cryptic:

> tb <- table(airquality$Month) 
> l <- lapply(tb, seq_len)
> unsplit(l, airquality$Month)   
  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
(etc.)

or, brute force and ignorance:

> x <- numeric(nrow(airquality))
> for (i in unique(airquality$Month)) {
+   ix <- airquality$Month == i
+   x[ix] <- seq_len(sum(ix))
+ }
> x
  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
....

or, going to the opposite extreme (Gabor et al. are going to try and
beat me on this...):

> seq.factor <- function(f) ave(rep(1,length(f)),f,FUN=cumsum)
> seq(as.factor(airquality$Month))
  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
....
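
That last one works because seq() is an S3 generic, so a function named
seq.factor gets picked up whenever seq() is handed a factor. If you'd
rather not hang a new method on a base generic, an ordinary function
does the same job. A sketch, with seq_within as a made-up name:

```r
# seq_within is a hypothetical helper name, not part of base R.
# It counts 1, 2, 3, ... within each level of the grouping vector g.
seq_within <- function(g) ave(rep(1, length(g)), g, FUN = cumsum)

g <- c("b", "a", "b", "b", "a")
seq_within(g)
```

Here the result is 1 1 2 3 2: the counting restarts per group and
follows the order in which the rows appear.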

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



