[R] Create counter variable for subsets without a loop

Thomas Brambor tbrambor at stanford.edu
Mon May 17 23:32:57 CEST 2010


Hi all,

I am looking to create a rank variable based on a continuous variable
within subsets of the data. For example, using a built-in R data set
about US states, this is how a loop could create what I want:

### Example with loop
data <- cbind(state.region, as.data.frame(state.x77))[, 1:2]   # choosing a subset of the data
data <- data[order(data$state.region, 1/data$Population), ]    # ordering the data
regions <- levels(data$state.region)
temp <- NULL
ranks <- NULL
for (i in 1:length(regions)){
    temp <- rev(rank(data[data$state.region==regions[i],"Population"]))
    ranks <- c(ranks,temp)
  }
data$rank <- ranks
data

where data$rank is the rank of the state by population within its
region (1 = most populous).

However, the loop is slow and cumbersome. My actual data set is fairly
large, with many subgroups, and the loop takes a long time to run. Can
someone suggest a way to create such a rank variable for subsets
without using a loop?

Thank you,
Thomas
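
One loop-free possibility, offered only as a sketch: base R's ave()
applies a function within each level of a grouping factor and returns
the results in the original row order. The example below assumes the
same built-in state data as the loop above, and it assumes that rank's
default tie handling (averaged ranks) is acceptable; neither assumption
has been checked against the real, larger data set.

```r
# Built-in data: region factor plus the Population column of state.x77
data <- data.frame(state.region, Population = state.x77[, "Population"])

# ave() calls FUN separately for each region and puts the results back
# in the original row order, so no pre-sorting or explicit loop is needed.
# Negating Population makes rank 1 the most populous state in its region.
data$rank <- ave(-data$Population, data$state.region, FUN = rank)

# Sort only for display: by region, then by rank within region
data[order(data$state.region, data$rank), ]
```

Because the grouping is handled inside ave(), the result does not
depend on how the rows happen to be ordered beforehand, unlike the
rev(rank(...)) construction in the loop.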


