[R] How to generate a conditional dummy in R?

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Tue May 29 01:06:06 CEST 2018


I was not able to decipher your meaning, and your failure to garner a
response so far may mean that others may have had similar difficultes. You
might therefore try providing providing a **small reproducible** example of
X's and Y's that show what you have and what you want to end up with to
clarify your meaning. Or continue to wait for someone smarter to respond.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, May 28, 2018 at 3:43 AM, Faradj Koliev <faradj.g using gmail.com> wrote:

> Hi everyone,
>
> I am trying to generate a conditional dummy variable ”X" with the
> following rules
>
>  set X=1 if Y is =1, two years prior to the NA.  [0,0,NA].
>
> For example, if  the pattern for Y is 0,0,NA then the X variable is =0 for
> all  the two years prior to the NA. If the pattern for Y is 0,1,NA or
> 1,0,NA then the X =1 . To be clear, if 1,1,NA then the X=1 that  first
> specific year, it should only count once (X=1), not twice.
>
> The code that I have now is not complete and I would appreciate some
> advice here. This is the code:
> dat2 <- dat1 %>%
>   group_by(country) %>%
>   group_by(grp = cumsum(is.na(lag(Y))), add = TRUE) %>%
>   mutate(first_year_at_1 = match(1, Y) * any(is.na(Y)) * any(tail(Y, 3)
> == 1L),
>          X = {x <- integer(length(Y)) ; x[first_year_at_1] <- 1L ; x}) %>%
>   ungroup()
>
> It doesn’t really generate what I described above. Any help here would be
> much appreciated.
>
> Below you can see my sample data with the desired outcome ”X” dummy in it.
>
> Thank you!
>
> > dput(data)
> structure(list(year = c(1991L, 1992L, 1993L, 1994L, 1995L, 1996L,
> 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L,
> 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L,
> 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L,
> 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L,
> 2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L,
> 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
> 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L,
> 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L,
> 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L,
> 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
> 1999L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
> 2007L, 2008L, 2009L, 2010L, 2011L), country = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Canada",
> "Cuba", "Dominican Republic", "Haiti", "Jamaica"), class = "factor"),
>     Y = c(1L, NA, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L,
>     1L, NA, 1L, NA, 1L, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA,
>     1L, NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA,
>     NA, 1L, NA, 1L, 0L, 0L, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, 0L,
>     0L, 1L, NA, 0L, 1L, 1L, NA, 0L, 1L, NA, 1L, NA, 1L, NA, 1L,
>     NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L,
>     0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, NA, 0L, 1L, 1L, 1L,
>     NA, 1L, NA, 0L, 1L, 1L, NA), X = c(1L, 0L, 0L, 1L, 0L, 0L,
>     1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
>     0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L,
>     0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L,
>     0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L,
>     1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L,
>     1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>     1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L)), .Names =
> c("year",
> "country", "Y", "X"), class = "data.frame", row.names = c(NA,
> -110L))
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]




More information about the R-help mailing list