[R] adding a dummy variable...

Dennis Murphy djmuser at gmail.com
Tue Oct 4 19:02:38 CEST 2011


Hi:

Here's another way to do it with the plyr package, also not terribly
elegant. It assumes that rel.head is a factor in your original data
frame:
> str(df)
'data.frame':   11 obs. of  2 variables:
 $ ID      : Factor w/ 6 levels "17100","17101",..: 1 1 2 3 4 4 5 5 5 6 ...
 $ rel.head: Factor w/ 3 levels "1","2","3": 1 3 1 1 1 2 1 2 3 1 ...

If this is not the case in your data, then you need to modify the
function f below accordingly, because identical() compares the factor
levels as well as the values. (This is why use of dput() is preferred
when sending example data to R-help, BTW.)
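
For instance, if rel.head arrives as character (data.frame() stopped
converting strings to factors by default in R 4.0.0), one way to get the
structure shown above is to convert the columns first. This is only a
sketch: df_chr is a made-up name for illustration, and the values are
copied from the original post.

```r
# Hypothetical character-column version of the example data
# (df_chr is an illustrative name, not from the original post)
df_chr <- data.frame(
    ID = c("17100", "17100", "17101", "17102", "17103", "17103",
           "17104", "17104", "17104", "17105", "17105"),
    rel.head = c("1", "3", "1", "1", "1", "2", "1", "2", "3", "1", "3"),
    stringsAsFactors = FALSE
)

# Convert to factors; the explicit levels give rel.head the same
# three levels that str(df) reports above
df_chr$ID <- factor(df_chr$ID)
df_chr$rel.head <- factor(df_chr$rel.head, levels = c("1", "2", "3"))

str(df_chr)    # check the column types before going further
dput(df_chr)   # prints a copy-and-paste representation for posting
```

The output of dput() can be pasted into a message so that readers
recreate the object with its column types intact.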

library('plyr')

f <- function(d) {
    # Target vector: household head (1) followed by a child (3)
    tvec <- factor(c(1, 3), levels = 1:3)
    # dummy is 1 only for a two-row household whose rel.head matches tvec
    # exactly (same values, same order, same factor levels); otherwise 0.
    # The single value is recycled over all rows of d.
    d$dummy <- if (nrow(d) == 2L && identical(d$rel.head, tvec)) 1 else 0
    d
}

ddply(df, .(ID), f)

      ID rel.head dummy
1  17100        1     1
2  17100        3     1
3  17101        1     0
4  17102        1     0
5  17103        1     0
6  17103        2     0
7  17104        1     0
8  17104        2     0
9  17104        3     0
10 17105        1     1
11 17105        3     1
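
If the exact match in f is too strict, a base R variant along the
following lines might be worth a try. It is a rough sketch rather than a
tested replacement: it ignores the order of the rows within a household
and compares rel.head as character, so it should not matter whether the
column is a factor. The name is_target is introduced here for
illustration only.

```r
# Example data from the original post
df <- data.frame(
    ID = c("17100", "17100", "17101", "17102", "17103", "17103",
           "17104", "17104", "17104", "17105", "17105"),
    rel.head = c("1", "3", "1", "1", "1", "2", "1", "2", "3", "1", "3")
)

# Compare as character so factor and character columns behave the same
rel <- as.character(df$rel.head)

# One logical per household: exactly two rows, one head (1) and one child (3)
is_target <- tapply(rel, df$ID, function(r) {
    length(r) == 2L && setequal(r, c("1", "3"))
})

# Look up each row's household result by ID and store it as 0/1
df$dummy <- as.integer(is_target[as.character(df$ID)])
df
```

Here setequal() checks that the two rows are one head and one child in
either order, whereas identical() in f requires the head to come first.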

HTH,
Dennis

On Tue, Oct 4, 2011 at 8:44 AM,  <grazia at stat.columbia.edu> wrote:
> Hi all,
>
> I have a dataset of individuals where the variable ID identifies the
> household in which the individual lives. rel.head stands for the
> relationship to the household head, so rel.head=1 is the household
> head, rel.head=2 is the spouse, and rel.head=3 is a child.
>
> Here is an example of what it looks like:
>
> df<-data.frame(ID=c("17100", "17100", "17101", "17102", "17103", "17103",
>                     "17104", "17104", "17104", "17105", "17105"),
>  rel.head=c("1","3","1","1","1", "2", "1", "2", "3", "1", "3"))
>
>
> I want to add a dummy variable that is equal to 1 when these conditions
> hold simultaneously:
>
> a) the number of rows with the same ID is equal to 2
> b) one of those rows has rel.head=1 and the other has rel.head=3
>
>
> So my ideal output is:
>
>       ID rel.head added.dummy
> 1  17100        1           1
> 2  17100        3           1
> 3  17101        1           0
> 4  17102        1           0
> 5  17103        1           0
> 6  17103        2           0
> 7  17104        1           0
> 8  17104        2           0
> 9  17104        3           0
> 10 17105        1           1
> 11 17105        3           1
>
> Is there a simple way to do that?
> Can somebody help?
>
> Thanks in advance,
> Grazia