[R] how to count unique observations by variables

David Winsemius dwinsemius at comcast.net
Thu Oct 16 08:16:25 CEST 2008


On Oct 16, 2008, at 1:27 AM, Lijiang Guo wrote:

> Dear R-helpers,
>
> I have a data frame with 3 variables, each record is a unique
> combination of the three variables. I would like to count the number
> of unique values of v3 in each v1, and save it as a new variable v4
> in the same data frame.
> e.g.
> df1
>     [v1] [v2] [v3]
> [1,] "a"  "C"  "1"
> [2,] "b"  "C"  "2"
> [3,] "c"  "B"  "3"
> [4,] "a"  "B"  "3"
> [5,] "b"  "A"  "2"
> [6,] "c"  "A"  "1"
>
> In this case, the 4th column would become (2, 1, 2, 2, 1, 2).

txt <- '   v1 v2 v3
 "a"  "C"  "1"
 "b"  "C"  "2"
 "c"  "B"  "3"
 "a"  "B"  "3"
 "b"  "A"  "2"
 "c"  "A"  "1"'

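df1 <- read.table(textConnection(txt), header = TRUE)

# One table of v3 values per level of v1; the length of each table is
# the number of distinct v3 values in that group.
grps <- tapply(df1$v3, df1$v1, FUN = table)

# > sapply(grps, length)
# a b c
# 2 1 2

# Index the per-group counts by name so that each row picks up the count
# for its own v1 (as.character() guards against indexing by factor codes).
df1$v4 <- sapply(grps, length)[as.character(df1$v1)]

df1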

#   v1 v2 v3 v4
# 1  a  C  1  2
# 2  b  C  2  1
# 3  c  B  3  2
# 4  a  B  3  2
# 5  b  A  2  1
# 6  c  A  1  2

-- 
David Winsemius, MD
Heritage Labs

>
> Could someone tell me how to do this?
>
> regards,
> Lijiang


