Fri May 18 16:23:24 CEST 2007

According to your post you are assuming that there are only 3 unique
values for var3 within each category. But category C and D have 4 unique
values for var3.

split(dfr, dfr$categ)
...
$C
   id categ var3 score
3   3     C    6  high
7   7     C    5   mid
11 11     C    3   low
15 15     C    1   low
...
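
As a quick check of those counts, here is a minimal sketch that assumes dfr is built exactly as in your post (n_unique is just a name made up for illustration):

```r
## rebuild the example data from the original post
dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
                  var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ),]

## number of distinct var3 values within each category
n_unique <- tapply(dfr$var3, dfr$categ, function(x) length(unique(x)))
print(n_unique)
```

Any category with a count above 3 has values that fall outside a strict high/mid/low reading.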

If you meant something different, just change myfun() below:

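gmax <- function(x, rnk=1){
  ## generalized maximum: rnk=1 is the biggest unique value (i.e. max),
  ## rnk=2 the second biggest, etc.; NA if x has fewer than rnk unique values
  sort( unique(x), decreasing=TRUE )[rnk]
}

## "mid" (not "med") matches the label used in the original question
myfun <- function(x){ ifelse( x==gmax(x,1), "high",
                      ifelse( x==gmax(x,2), "mid", "low" ) ) }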

out   <- lapply( split(dfr$var3, dfr$categ), myfun )

data.frame( dfr, my.score = unsplit(out, dfr$categ) )
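
A possible alternative, offered only as a sketch: ave() can do the split/apply/unsplit in one step. It assumes dfr is built as in your post; rank_in_group() and rnk are hypothetical helper names chosen for illustration, not part of base R.

```r
## rebuild the example data from the original post
dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
                  var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ),]

## position of each value among the unique values of its group, 1 = largest
rank_in_group <- function(x) match(x, sort(unique(x), decreasing=TRUE))

## ave() applies the function within each category and keeps the row order
rnk <- ave(dfr$var3, dfr$categ, FUN=rank_in_group)

## rank 1 -> "high", rank 2 -> "mid", everything else -> "low"
dfr$my.score <- ifelse(rnk == 1, "high", ifelse(rnk == 2, "mid", "low"))
dfr
```

On the example data the labels match the score column in your post; ties share a label, and any value ranked third or lower becomes "low".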

Lauri Nikkinen wrote:
> Hi R-users,
>
> I have a simple question for R heavy users. If I have a data frame like this
>
>
> dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> dfr <- dfr[order(dfr$categ),]
>
> and I want to score values or points in variable named "var3" following this
> kind of logic:
>
> 1. the highest value of var3 within category (variable named "categ") ->
> "high"
> 2. the second highest value -> "mid"
> 3. lowest value -> "low"
>
> This would be the output of this reasoning:
>
> dfr$score <-
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> dfr
>
> The question is how I do this programmatically in R (i.e. if I have 2000
> rows in my dfr)?
>
>
> Cheers,
> Lauri
>
>
