[R] Looping through data tables (or data frames) by removing previous individuals

Ista Zahn istazahn at gmail.com
Mon Oct 3 21:34:32 CEST 2016


Hi Frank,

How about

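library(data.table)
library(lubridate)
# one row per (id, reference date) pair
dtf <- merge(dt, expand.grid(id = dt$id, refdate = v), by = "id")
# TRUE once the person is at least 65 on refdate
dtf[, gt65 := as.period(interval(fborn, refdate), unit = "years") >= years(65)]
# for each id, keep the earliest refdate at which they are 65 or older
dtf <- dtf[gt65 == TRUE,][, .SD[refdate == min(refdate)], by = id]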
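
If you really need a list of k tables (one per date in v) rather than one
long table, here is a rough base R sketch of the same idea. It is only an
illustration, not a tested replacement for the code above: it uses a plain
data.frame copy of your example data, and add_years() is a helper name I made
up for this sketch. It finds each person's 65th birthday, looks up the first
date in v on or after it, and splits on that index.

```r
# Example data copied from the question (as a data.frame, no packages needed)
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

# Hypothetical helper (not from the thread): shift a Date by n calendar years
add_years <- function(d, n) {
  lt <- as.POSIXlt(d)
  lt$year <- lt$year + n
  as.Date(lt)
}
b65 <- add_years(dt$fborn, 65)   # date of each person's 65th birthday

# Index of the first reference date on or after the 65th birthday
# (NA if the person is still under 65 on the last date in v)
first_idx <- vapply(seq_along(b65),
                    function(i) which(v >= b65[i])[1],
                    integer(1))

# One table per reference date; dates with nobody give a zero-row table
dt_p <- split(dt, factor(first_idx, levels = seq_along(v)))
names(dt_p) <- paste0("dt_p", seq_along(v))
```

Note that a date on which nobody turns 65 gives a zero-row data frame here
rather than NULL, which is usually easier to work with later (rbind, nrow,
and so on).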

Best,
Ista

On Mon, Oct 3, 2016 at 1:17 PM, Frank S. <f_j_rod at hotmail.com> wrote:
> Dear R users,
>
> With this mail I send the third and last question I wanted to ask these days. First of all, many thanks
> for the support received in my previous mails! My question is this: starting from a series of (for example)
> "k" different dates (all contained in vector "v"), I want to get a list of "k" data tables (or data frames) so
> that each contains those individuals who are at least 65 for the first time, looping over each of the dates of
> vector "v". Let's consider the following example with 5 individuals:
>
>
> dt <- data.table(
>    id = 1:5,
>    fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07", "1943-09-07", "1943-12-31")),
>    sex = as.factor(rep(c(0, 1), c(2, 3)))
>    )
>
> v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by ="year") # k=4
>
>
> I would expect to obtain k=4 data tables so that:
> dt_p1: contains id = 1 (he is for the first time at least 65 on date v[1])
> dt_p2: is NULL (no subject reaches 65 for the first time on date v[2])
> dt_p3: contains id = 2 & id = 3 (they are for the first time at least 65 on v[3])
> dt_p4: contains id = 4 & id = 5 (they are for the first time at least 65 on v[4])
>
>
> I have tried:
>
> dt_p <- list( )                        # Empty list to allocate data tables
>
> for (i in 1:length(v)) {
>   dt_p[[i]] <- dt[ !(id %in% dt_p[[1:(i-1)]]$id) &  # Remove subjects from previous dt_p's
>          round((v[i] - fborn)/365.25, 2) >= 65, ][ , list(id, fborn, sex)]
>
>  dt.names <- paste0("dt_p", 1:length(v))
>  assign(dt.names[i], dt_p[[i]])         # Assign a name to each data table
>  }
>
> However, I cannot correctly refer to the previous data tables, because for the first data
> table in the loop there are no previous ones. Consequently, I get an error message:
>
> # Error in dt_p[[1:(i - 1)]] : no such index at level 1
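
The error comes from the index itself. When i is 1, 1:(i-1) is 1:0, which is
c(1, 0) and not an empty vector. On top of that, [[ with a vector of length
greater than one indexes recursively, so even for later i, dt_p[[1:2]] means
dt_p[[1]][[2]] and not "the first two tables". A minimal sketch of one way
around this, keeping your loop and your 365.25-day approximation of age, is to
carry the ids already assigned in a separate vector. The names seen and keep
are mine, and I use a plain data.frame here so it runs without any packages:

```r
# Example data copied from the question (as a data.frame)
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

dt_p <- vector("list", length(v))   # pre-allocated list, one slot per date
seen <- integer(0)                  # ids already placed in an earlier table

for (i in seq_along(v)) {
  # approximate age in years on v[i], as in the original loop
  age  <- as.numeric(v[i] - dt$fborn) / 365.25
  keep <- !(dt$id %in% seen) & age >= 65
  dt_p[[i]] <- dt[keep, c("id", "fborn", "sex")]
  seen <- c(seen, dt_p[[i]]$id)     # remember who has been assigned
}
names(dt_p) <- paste0("dt_p", seq_along(v))
```

With a named list you can write dt_p$dt_p3 or dt_p[["dt_p3"]], so the
assign() call is probably not needed. Keep in mind that dividing days by
365.25 is only an approximation and can be off by a day close to a birthday.
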
>
>
> I would be very grateful for any suggestion!
>
> Frank S.


