[R] R code for if-then-do code blocks

Paul Miller pjmiller_57 @ending from y@hoo@com
Wed Dec 19 18:28:35 CET 2018


Hi Gabor, Richard, and Thierry, 

Thanks very much for your replies. Turns out I had already hit on Gabor's idea of "factor out" in writing an initial draft of the code converting from SAS to R. Below is the link Gabor sent describing this and other approaches. 

https://stackoverflow.com/questions/34096162/dplyr-mutate-replace-on-a-subset-of-rows/34096575#34096575

At the end of this email are some new test data plus a snippet of my initial R code. The R code I have replicates the result from SAS but is quite verbose. That should be obvious from the snippet. I know I can make the code less verbose with a subsequent draft but wonder if I can simplify to the point where the factor out approach gets a fair test. I'd appreciate it if people could share some ways to make the factor out approach less verbose. I'd also like to see how well some of the other approaches might work with these data. I spent considerable time looking at the link Gabor sent as well as the other responses I received. The mutate_cond function in the link seems promising but it wasn't clear to me how I could avoid having to repeat the various conditions using that approach. 

Thanks again.

Paul

library(magrittr)
library(dplyr)
 
test_data <-
  structure(
    list(
      intPatientId = c("3", "37", "48", "6", "6", "5"),
      intSurveySessionId = c(1L, 10996L, 19264L, 2841L, 28L, 34897L),
      a_CCMA02 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_),
      a_CCMA69 = c(7, NA, 0, 2, NA, 0),
      a_CCMA70 = c(7, 0, NA, 10, NA, NA),
      a_CCMA72 = c(7, 2, 3, NA, NA, NA),
      CCMA2 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, NA_integer_),
      a_CCMA05 = c(NA, NA, NA, NA, NA, 0),
      a_CCMA43 = c(5, 0, 6, 5, NA, NA),
      a_CCMA44 = c(5, 0, 0, 5, 0, NA),
      CCMA5 = c(NA, NA, NA, NA, NA, 0)
    ),
    class = "data.frame",
    row.names = c(NA,-6L)
  )

factor_out <- test_data %>%
  mutate(
    CCMA2_cond = case_when(
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (!is.na(a_CCMA69) & between(a_CCMA69, 0, 10) &
           !is.na(a_CCMA70) & between(a_CCMA70, 0, 10) &
           !is.na(a_CCMA72) & between(a_CCMA72, 0, 10)) ~ "A",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA69) | a_CCMA69 < 0 | a_CCMA69 >= 10) &
        !is.na(a_CCMA70) & between(a_CCMA70, 0, 10) &
        !is.na(a_CCMA72) & between(a_CCMA72, 0, 10) ~ "B",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA70) | a_CCMA70 < 0 | a_CCMA70 >= 10) &
        between(a_CCMA69, 0, 10) & between(a_CCMA72, 0, 10) ~ "C",
      (is.na(a_CCMA02) | a_CCMA02 < 0 | a_CCMA02 > 10) &
        (is.na(a_CCMA72) | a_CCMA72 < 0 | a_CCMA72 >= 10) &
        between(a_CCMA69, 0, 10) & between(a_CCMA70, 0, 10) ~ "D")
  ) %>%
  mutate(
    CCMA2 = case_when(
      CCMA2_cond == "A" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "A" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "A" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "B" & 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "B" & 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "B" ~ 0.614 + (0.065 * (a_CCMA70 + a_CCMA72) / 2) + (-0.012 * a_CCMA70) + (0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "C" & 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + a_CCMA69) / 2 ) + (0.504 * a_CCMA72) < 0  ~ 0,
      CCMA2_cond == "C" & 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + a_CCMA69) / 2 ) + (0.504 * a_CCMA72) > 10 ~ 10,
      CCMA2_cond == "C" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 *(a_CCMA72 + a_CCMA69) / 2 ) + (0.504 * a_CCMA72),
      TRUE ~ as.double(CCMA2)
    ),
    CCMA2 = case_when(
      CCMA2_cond == "D" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + (0.504 *(a_CCMA70 + a_CCMA69) / 2) < 0  ~ 0,
      CCMA2_cond == "D" & 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + (0.504 *(a_CCMA70 + a_CCMA69) / 2) > 10 ~ 10,
      CCMA2_cond == "D" ~ 0.614 + (0.065 * a_CCMA69) + (-0.012 * a_CCMA70 ) + (0.504 *(a_CCMA70 + a_CCMA69) / 2),
      TRUE ~ as.double(CCMA2)
    )
  ) %>%
  select(-CCMA2_cond) %>%
  mutate(
    CCMA5_condA = if_else(
      (is.na(a_CCMA05) | a_CCMA05 < 0 | a_CCMA05 > 10),
      1, 0
    ),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA43, 0, 10) & between(a_CCMA44, 0, 10),
                   0.216 + (0.257 * a_CCMA43) + (0.828 * a_CCMA44),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA43, 0, 10) & (is.na(a_CCMA44) | a_CCMA44 < 0 | a_CCMA44 > 10),
                   0.216 + (0.257 * a_CCMA43) + (0.828 * a_CCMA43),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & between(a_CCMA44, 0, 10) & (is.na(a_CCMA43) | a_CCMA43 < 0 | a_CCMA43 > 10),
                   0.216 + (0.257 * a_CCMA44) + (0.828 * a_CCMA44),
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & !is.na(CCMA5) & CCMA5 < 0,
                   0,
                   CCMA5),
    CCMA5 = ifelse(CCMA5_condA == 1 & CCMA5 > 10,
                   10,
                   CCMA5)
  ) %>%
  select(-CCMA5_condA)



More information about the R-help mailing list