[R] alternative for multiple if_else statements

Kevin Wamae KWamae at kemri-wellcome.org
Fri Feb 23 06:11:20 CET 2018


Dear Ista, thank you. Let me see how best I can implement this.

Regards
------------------
Kevin Wamae

On 22/02/2018, 16:58, "Ista Zahn" <istazahn at gmail.com> wrote:

    I don't fully understand the logic you are trying to implement, but
    something along the lines of
    
    foo <- cut(trialData$date,
               breaks = as.Date(c("2007-01-01",
                                  "2008-05-01",
                                  "2009-04-01",
                                  "2010-05-01",
                                  "2011-05-01",
                                  "2012-04-01",
                                  "2013-04-01",
                                  "2014-04-01",
                                  "2015-04-01",
                                  "2016-03-01",
                                  "2017-01-01")))
    
    might work.
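
When no labels are supplied, the cut() call above returns a factor whose levels are named after the interval start dates. Below is a minimal sketch of how labels in the survey_YYYY form could be attached. The break dates are copied from the suggestion above; the sample dates and the names visit_dates, survey_breaks and survey_labels are illustrative assumptions, not part of the original post.

```r
# Hypothetical sample of visit dates; the real input would be trialData$date.
visit_dates <- as.Date(c("2014-01-08", "2014-05-05", "2015-03-31",
                         "2015-04-01", "2016-03-01", "2016-06-28"))

# Survey-year boundaries, copied from the cut() call above.
survey_breaks <- as.Date(c("2007-01-01", "2008-05-01", "2009-04-01",
                           "2010-05-01", "2011-05-01", "2012-04-01",
                           "2013-04-01", "2014-04-01", "2015-04-01",
                           "2016-03-01", "2017-01-01"))

# One label per interval: 11 breaks define 10 intervals, 2007 to 2016.
survey_labels <- paste0("survey_", 2007:2016)

# cut() on Date objects uses left-closed intervals by default (right = FALSE),
# so a visit on a break date falls into the survey year starting that day.
# Dates outside the first and last break become NA.
survey_year <- cut(visit_dates, breaks = survey_breaks, labels = survey_labels)

print(data.frame(date = visit_dates, survey_year = as.character(survey_year)))
```

Note that fixed calendar breaks will not reproduce participant-specific start dates. In the sample data, survey_start is flagged on 2014-05-05, so the April 2014 visits would be labelled survey_2014 under this sketch even though they precede that participant's flagged start.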
    
    Best,
    Ista
    
    On Wed, Feb 21, 2018 at 3:33 PM, Kevin Wamae <KWamae at kemri-wellcome.org> wrote:
    > Hi, I am having trouble figuring out why if_else is behaving the way it is; it may be my code or the way the data is structured.
    >
    > Below is a snapshot of a database I am working on; it represents a longitudinal survey of study participants in a trial with weekly follow-up.
    >
    > The variable "survey_start" marks the start of the study-defined one-year follow-up (which we called "survey_year").
    >
    > I am trying to populate all subsequent entries for each participant, per survey year, with the entry "survey" followed by an underscore and the respective year, e.g. survey_2014.
    >
    > There are missing entries: for example, the participant represented here wasn't available at the start of the 2015 survey. Also, some participants don't have complete one-year follow-ups, but I still need to include them.
    >
    > I have written two versions of the code. The first fails while the second works; the only differences are that the second populates the entries in reverse order (2016 down to 2007 instead of 2007 up to 2016) and omits the if_else statement for 2015. I also noticed that the second version, which spans the years 2007-2016 (less 2015), fails if a participant's entries span only 2010-2016.
    >
    > Kindly assist in figuring this out or, better yet, suggest an alternative.
    >
    >     trialData <- structure(list(study = c("site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
    > "site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
    > "child_1", "child_1"), date = structure(c(16078, 16085, 16092,
    > 16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
    > 16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
    > 16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
    > 16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
    > 16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
    > 16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
    > 16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
    > 16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
    > 16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
    > 16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
    > 16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
    > 16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
    > 16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
    > 16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
    > 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
    > 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
    > 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
    > 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
    > 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
    > 2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
    > 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
    > 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
    > 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
    > 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
    > 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
    > 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
    > 12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
    > 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
    > 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
    > 11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
    > 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
    > 6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
    > "", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
    > -125L), .Names = c("study", "studyno", "date", "year", "month",
    > "survey_start"))
    >
    >
    > code 1 fails:
    >
    > library(dplyr)  # provides %>%, arrange(), group_by(), mutate() and if_else()
    >
    > trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
    > mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",
    >                      if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
    >                      if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
    >                      if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
    >                      if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
    >                      if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
    >                      if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
    >                      if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
    >                      if_else(date >= date[survey_start == "Y" & year == 2015 & study == "site_1"][1] & date < date[month == 3 & year == 2016 & study == "site_1"][1], "survey_2015",
    >                      if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1], "survey_2016","")))))))))))
    >
    > code 2 works:
    >
    >     trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
    >   mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1]                                                               , "survey_2016",
    >                            if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
    >                            if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
    >                            if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
    >                            if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
    >                            if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
    >                            if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
    >                            if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
    >                            if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",""))))))))))
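
A likely reason for the difference between the two versions is NA propagation. The sample participant has no rows for 2007, so date[... & year == 2007 ...][1] is NA, the first condition in code 1 is NA for every row, and if_else() returns its missing value (NA by default) without falling through to the nested calls. The sketch below shows the same three-valued logic in base R only, with ifelse() standing in for dplyr::if_else(). The vectors dates and year and the names start_2007, cond and guarded are illustrative assumptions.

```r
# Hypothetical miniature of one participant's visits (no rows for 2007).
dates <- as.Date(c("2014-01-08", "2014-05-05", "2015-04-07"))
year  <- c(2014L, 2014L, 2015L)

# Indexing with a condition that matches nothing, then taking [1], gives NA.
start_2007 <- dates[year == 2007][1]

# Comparing against NA gives NA for every row.
cond <- dates >= start_2007

# ifelse() returns NA wherever the test is NA; the "no" branch is not used.
labels <- ifelse(cond, "survey_2007", "fallback")

# Combining with a second comparison does not always stay NA:
na_and_false <- NA & FALSE   # FALSE
na_and_true  <- NA & TRUE    # NA

# One possible guard: treat an NA condition as FALSE so later branches apply.
guarded <- ifelse(!is.na(cond) & cond, "survey_2007", "fallback")

print(labels)
print(guarded)
```

In code 2 the outermost condition (2016) is not NA for this participant, and the later conditions have the form NA & (date < x), which is FALSE whenever date >= x. That would let some rows reach the final "" while others still end up NA, which may explain why the reversed order appears to work here but not for participants whose entries cover a different range.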
    >
    > ______________________________________________________________________
    >
    > This e-mail contains information which is confidential. It is intended only for the use of the named recipient. If you have received this e-mail in error, please let us know by replying to the sender, and immediately delete it from your system.  Please note, that in these circumstances, the use, disclosure, distribution or copying of this information is strictly prohibited. KEMRI-Wellcome Trust Programme cannot accept any responsibility for the  accuracy or completeness of this message as it has been transmitted over a public network. Although the Programme has taken reasonable precautions to ensure no viruses are present in emails, it cannot accept responsibility for any loss or damage arising from the use of the email or attachments. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of KEMRI-Wellcome Trust Programme.
    > ______________________________________________________________________
    >
    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.
    

