[R] equivalent of group command of the egen function in Stata

Ista Zahn istazahn at gmail.com
Mon Dec 10 15:58:10 CET 2012


Hi,

On Mon, Dec 10, 2012 at 9:33 AM, Francesco Sarracino
<f.sarracino at gmail.com> wrote:
>
> Dear R listers,
>
> I am trying to create a new variable that uniquely identifies groups of
> observations in a dataset. So far I couldn't figure out how to do this in
> R. In Stata I would simply type:
> egen newvar = group(dim1, dim2, dim3)

A rough equivalent is

dat$group <- with(dat, interaction(dim1, dim2, dim3))

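The differences between this and the Stata command are that the result
in R is a factor rather than a number, and the default ordering is
different: interaction() varies the first variable fastest unless you
set lex.order = TRUE, whereas Stata numbers the groups in the sort order
of the variables. interaction() also keeps levels for combinations that
never occur in the data unless you set drop = TRUE.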
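
If you need an integer id that is closer to what Stata returns, something
along these lines may work. This is only a sketch: it rebuilds the example
data from the question under the name dat (set.seed() and the column name
group_id are my additions), and it assumes Stata's group() numbers the
groups 1, 2, ... in the sort order of dim1, dim2, dim3.

```r
# Example data mirroring the question; set.seed() is added only so that
# the random column is reproducible.
set.seed(1)
dat <- data.frame(
  var  = runif(50),
  dim1 = factor(rep(1:3, length.out = 50), labels = c("x", "y", "z")),
  dim2 = rep(1:2, length.out = 50),
  dim3 = rep(1:5, length.out = 50)
)

# drop = TRUE discards combinations that never occur in the data;
# lex.order = TRUE orders the levels with dim1 varying slowest,
# i.e. as if the data were sorted by dim1, then dim2, then dim3.
g <- with(dat, interaction(dim1, dim2, dim3, drop = TRUE, lex.order = TRUE))

# The integer codes of the factor run from 1 to the number of groups.
dat$group_id <- as.integer(g)

# Quick check of how many groups were formed
length(unique(dat$group_id))
```

The factor g keeps the readable labels (such as x.1.1), so it can be handy
to store both the factor and the integer id.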

Best,
Ista
>
>
> Please, find below a quick example to show what I am dealing with:
> I have a dataset with 4 variables:
> var <- runif(50)   ## a variable that I want to group
> ## 3 variables that should form the groups
> dim1 <- factor(rep(1:3, length.out = 50), labels = c("x", "y", "z"))
> dim2 <- rep(1:2, length.out= 50)
> dim3 <- rep(1:5, length.out= 50)
>
> data <- data.frame(var, dim1, dim2, dim3)
>
> I am trying to build a fifth one (let's say: group_id) to uniquely identify
> groups of observations as defined by dim1, dim2 and dim3, i.e. 30  groups.
>
> Can you please help me figure out how to do it?
> Thanks in advance,
> f.
>
> --
> Francesco Sarracino, Ph.D.
> https://sites.google.com/site/fsarracino/