[R] adding a dummy variable...

baptiste auguie baptiste.auguie at googlemail.com
Tue Oct 4 21:09:02 CEST 2011


Hi,

Using ddply,

library(plyr)
ddply(df, .(ID), mutate, nrows = length(rel.head),
      test = nrows == 2 & all(rel.head %in% c(1, 3)))
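
Note that test comes back as a logical (TRUE/FALSE) rather than 0/1, and that all(rel.head %in% c(1,3)) would also accept a two-row household with, say, two rows of rel.head 1. If that matters, here is a minimal sketch in base R (tapply() instead of ddply()) that yields an integer column and requires exactly one 1 and one 3. The helper name is_head_child is made up for illustration, and the data are the example from the original question:

```r
# Example data from the original question
df <- data.frame(ID = c("17100", "17100", "17101", "17102", "17103", "17103",
                        "17104", "17104", "17104", "17105", "17105"),
                 rel.head = c("1", "3", "1", "1", "1", "2", "1", "2", "3", "1", "3"))

# Hypothetical helper: 1L if a household has exactly two rows,
# one head ("1") and one child ("3"), in either order; 0L otherwise
is_head_child <- function(x) {
  as.integer(length(x) == 2L && setequal(as.character(x), c("1", "3")))
}

# One flag per household, then looked up by ID for every row
flag <- tapply(as.character(df$rel.head), df$ID, is_head_child)
df$added.dummy <- as.integer(flag[as.character(df$ID)])

df$added.dummy
# 1 1 0 0 0 0 0 0 0 1 1
```

Comparing as character means it should not matter whether rel.head is stored as a factor or as a character vector.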

HTH,

baptiste


On 5 October 2011 06:02, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> Here's another way to do it with the plyr package, also not terribly
> elegant. It assumes that rel.head is a factor in your original data
> frame:
>> str(df)
> 'data.frame':   11 obs. of  2 variables:
>  $ ID      : Factor w/ 6 levels "17100","17101",..: 1 1 2 3 4 4 5 5 5 6 ...
>  $ rel.head: Factor w/ 3 levels "1","2","3": 1 3 1 1 1 2 1 2 3 1 ...
>
> If this is not the case in your data, then you need to modify the
> function f below accordingly. (This is why use of dput() is preferred
> when sending example data to R-help, BTW.)
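
For instance, a minimal sketch of the dput() round trip on a small made-up subset of the data. The temporary file is only there to make the example self-contained; on the list you would simply paste the printed output into your message:

```r
# Small illustrative subset, with rel.head explicitly stored as a factor
df <- data.frame(ID = c("17100", "17100", "17101"),
                 rel.head = factor(c("1", "3", "1"), levels = c("1", "2", "3")))

dput(df)                        # prints an R expression that recreates df, column types included

tf <- tempfile(fileext = ".R")
dput(df, file = tf)             # the same expression, written to a file
df2 <- dget(tf)                 # evaluating it rebuilds the data frame
str(df2)                        # rel.head is still a factor with three levels
```

The point is that the reader gets the column types along with the values, which printed output alone does not show.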
>
> library('plyr')
> f <- function(d) {
>    tvec <- factor(c(1, 3), levels = 1:3)   # target vector
>    if (nrow(d) != 2L) {
>       d$dummy <- rep(0, nrow(d))
>       return(d)
>    }
>    # Reached only when the group has exactly two rows:
>    d$dummy <- ifelse(!identical(d[, 2], tvec), 0, 1)
>    d
> }
>
> ddply(df, .(ID), f)
>
>      ID rel.head dummy
> 1  17100        1     1
> 2  17100        3     1
> 3  17101        1     0
> 4  17102        1     0
> 5  17103        1     0
> 6  17103        2     0
> 7  17104        1     0
> 8  17104        2     0
> 9  17104        3     0
> 10 17105        1     1
> 11 17105        3     1
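
One caveat with f above: since R 4.0.0, data.frame() no longer converts strings to factors by default, so with the example data as posted rel.head would be character and identical(d[, 2], tvec) would never be TRUE. Below is a hedged sketch of a variant that compares as character and so should work either way. f2 is a made-up name, and split()/lapply() from base R stand in for ddply() so the snippet runs without plyr. Like the original, it expects the head's row to come first:

```r
# Example data from the original question (character or factor columns both work)
df <- data.frame(ID = c("17100", "17100", "17101", "17102", "17103", "17103",
                        "17104", "17104", "17104", "17105", "17105"),
                 rel.head = c("1", "3", "1", "1", "1", "2", "1", "2", "3", "1", "3"))

# Hypothetical variant of f: dummy is 1 only for a two-row household
# whose rows are head ("1") followed by child ("3")
f2 <- function(d) {
  target <- c("1", "3")
  d$dummy <- as.integer(nrow(d) == 2L &&
                        identical(as.character(d$rel.head), target))
  d
}

# Base R stand-in for ddply(df, .(ID), f2): split by household, apply, recombine
out <- do.call(rbind, lapply(split(df, df$ID), f2))
rownames(out) <- NULL

out$dummy
# 1 1 0 0 0 0 0 0 0 1 1
```
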
>
> HTH,
> Dennis
>
> On Tue, Oct 4, 2011 at 8:44 AM,  <grazia at stat.columbia.edu> wrote:
>> Hi all,
>>
>> I have a dataset of individuals in which the variable ID identifies the
>> household where the individual lives. rel.head gives the relationship to
>> the household head: rel.head=1 is the household head, rel.head=2 is the
>> spouse, and rel.head=3 is a child.
>>
>> Here is an example of what it looks like:
>>
>> df<-data.frame(ID=c("17100", "17100", "17101", "17102", "17103", "17103",
>>                     "17104", "17104", "17104", "17105", "17105"),
>>  rel.head=c("1","3","1","1","1", "2", "1", "2", "3", "1", "3"))
>>
>>
>> I want to add a dummy variable that is equal to 1 when these conditions
>> hold simultaneously:
>>
>> a) the number of rows with the same ID is equal to 2
>> b) those two rows have rel.head=1 and rel.head=3
>>
>>
>> So my ideal output is:
>>
>>   ID      rel.head   added.dummy
>> 1  17100        1           1
>> 2  17100        3           1
>> 3  17101        1           0
>> 4  17102        1           0
>> 5  17103        1           0
>> 6  17103        2           0
>> 7  17104        1           0
>> 8  17104        2           0
>> 9  17104        3           0
>> 10 17105        1           1
>> 11 17105        3           1
>>
>> Is there a simple way to do that?
>> Can somebody help?
>>
>> Thanks in advance,
>> Grazia
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>