[R] Compute rank within factor groups

jim holtman jholtman at gmail.com
Thu Jul 12 21:34:49 CEST 2007


Is this what you are looking for?

> x
        report score
9         ADEA  0.96
8         ADEA  0.90
11 Asylum_FED9  0.86
3         ADEA  0.75
14 Asylum_FED9  0.60
5         ADEA  0.56
13 Asylum_FED9  0.51
16 Asylum_FED9  0.51
2         ADEA  0.42
7         ADEA  0.31
17 Asylum_FED9  0.27
1         ADEA  0.17
4         ADEA  0.17
6         ADEA  0.12
10        ADEA  0.11
12 Asylum_FED9  0.10
15 Asylum_FED9  0.09
18 Asylum_FED9  0.07
> x$rank <- ave(x$score, x$report, FUN=rank)
> x
        report score rank
9         ADEA  0.96 10.0
8         ADEA  0.90  9.0
11 Asylum_FED9  0.86  8.0
3         ADEA  0.75  8.0
14 Asylum_FED9  0.60  7.0
5         ADEA  0.56  7.0
13 Asylum_FED9  0.51  5.5
16 Asylum_FED9  0.51  5.5
2         ADEA  0.42  6.0
7         ADEA  0.31  5.0
17 Asylum_FED9  0.27  4.0
1         ADEA  0.17  3.5
4         ADEA  0.17  3.5
6         ADEA  0.12  2.0
10        ADEA  0.11  1.0
12 Asylum_FED9  0.10  3.0
15 Asylum_FED9  0.09  2.0
18 Asylum_FED9  0.07  1.0
>
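
Note that rank() ranks in ascending order, so in the output above the
highest score in each report gets the largest rank, and tied scores get
the average of their ranks (the 5.5 and 3.5 values). If you want 1 for the
highest score in each group, as your loop produces, something along these
lines may be closer. This is only a sketch: the data frame is retyped from
the listing above without the original row names, and the column names
rank_desc and pos are made up for illustration.

```r
# Rebuild the example data from the listing above (same row order,
# default row names instead of the original ones).
x <- data.frame(
  report = c("ADEA", "ADEA", "Asylum_FED9", "ADEA", "Asylum_FED9", "ADEA",
             "Asylum_FED9", "Asylum_FED9", "ADEA", "ADEA", "Asylum_FED9",
             "ADEA", "ADEA", "ADEA", "ADEA", "Asylum_FED9", "Asylum_FED9",
             "Asylum_FED9"),
  score  = c(0.96, 0.90, 0.86, 0.75, 0.60, 0.56, 0.51, 0.51, 0.42, 0.31,
             0.27, 0.17, 0.17, 0.12, 0.11, 0.10, 0.09, 0.07),
  stringsAsFactors = TRUE
)

# Rank 1 = highest score within each report; ties share the average rank.
x$rank_desc <- ave(-x$score, x$report, FUN = rank)

# Position within each report in the current row order, which is what the
# original for() loop computes (ties are numbered in order of appearance).
x$pos <- ave(seq_along(x$report), x$report, FUN = seq_along)
```

The two columns differ only where scores are tied: rank_desc depends on the
scores alone, while pos depends on the rows already being sorted by score.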


On 7/12/07, Ken Williams <ken.williams at thomson.com> wrote:
> Hi,
>
> I have a data.frame which is ordered by score, and has a factor column:
>
>  Browse[1]> wc[c("report","score")]
>          report score
>  9         ADEA  0.96
>  8         ADEA  0.90
>  11 Asylum_FED9  0.86
>  3         ADEA  0.75
>  14 Asylum_FED9  0.60
>  5         ADEA  0.56
>  13 Asylum_FED9  0.51
>  16 Asylum_FED9  0.51
>  2         ADEA  0.42
>  7         ADEA  0.31
>  17 Asylum_FED9  0.27
>  1         ADEA  0.17
>  4         ADEA  0.17
>  6         ADEA  0.12
>  10        ADEA  0.11
>  12 Asylum_FED9  0.10
>  15 Asylum_FED9  0.09
>  18 Asylum_FED9  0.07
>  Browse[1]>
>
> I need to add a column indicating rank within each factor group, which I
> currently accomplish like so:
>
>  wc$rank <- 0
>  for(report in as.character(unique(wc$report))) {
>    wc[wc$report==report,]$rank <- 1:sum(wc$report==report)
>  }
>
> I have to wonder whether there's a better way, something that gets rid of
> the for() loop using tapply() or by() or similar.  But I haven't come up
> with anything.
>
> I've tried these:
>
>  by(wc, wc$report, FUN=function(pr){pr$rank <- 1:nrow(pr)})
>
>  by(wc, wc$report, FUN=function(pr){
>    wc[wc$report %in% pr$report,]$rank <- 1:nrow(pr)})
>
> But in both cases the effect of the assignment is lost; no $rank
> column is generated for wc.
>
> Any suggestions?
>
>  -Ken
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
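
On why the by() attempts appear to do nothing: the function given to by()
receives a copy of each subset, and an assignment with <- inside a function
only changes objects in that function's own environment. The first attempt
modifies the local pr and discards it; the second creates a local, modified
copy of wc and discards that too. If you prefer the split/apply/combine
style over ave(), one possible sketch is to return the modified piece from
the function and reassemble the pieces afterwards. The small wc below is
just the first five rows of the posted data, used as a stand-in.

```r
# Stand-in for the poster's `wc`: the first five rows of the posted data.
wc <- data.frame(
  report = factor(c("ADEA", "ADEA", "Asylum_FED9", "ADEA", "Asylum_FED9")),
  score  = c(0.96, 0.90, 0.86, 0.75, 0.60)
)

# Each piece is a copy of a subset, so the function has to return the
# modified copy for the new column to survive.
pieces <- lapply(split(wc, wc$report), function(pr) {
  pr$rank <- seq_len(nrow(pr))
  pr
})

# unsplit() puts the pieces back together in the original row order.
wc <- unsplit(pieces, wc$report)
```

For a single new column, ave() is shorter; the split()/unsplit() form may
be handier when the per-group function needs several columns at once.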


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


