[R] Converting a data set from 'long' format to 'interval' format

Henrique Dallazuanna wwwhsd at gmail.com
Wed Mar 24 21:47:30 CET 2010


Try this:

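# Summarise one run of constant dose as a single Start/Stop row.
foo <- function(x) {
  data.frame(Id = x$Id[1], Start = min(x$time), Stop = max(x$time),
             dose = x$dose[1])
}

d <- as.data.frame(orig.data)
# Number the runs of constant (Id, dose), so that a run never spans two
# individuals. Assumes the rows are sorted by Id, then by time.
runs <- with(rle(paste(d$Id, d$dose)), rep(seq_along(lengths), lengths))
do.call(rbind, lapply(split(d, runs), foo))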
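
Since the question mentions large data sets, here is a rough sketch of a
variant that avoids split() altogether and only uses vectorised comparisons
between neighbouring rows. The name to_interval is made up for this example,
and the sketch assumes at least one row, columns called Id, time and dose,
and rows already sorted by Id and then by time. I have not timed it.

```r
# Hypothetical helper: collapse long-format rows into Start/Stop intervals.
# Assumes columns Id, time, dose and rows sorted by Id, then time.
to_interval <- function(d) {
  d <- as.data.frame(d)
  n <- nrow(d)
  # TRUE on the first row of each run: Id or dose differs from the row above.
  first <- c(TRUE, d$Id[-1] != d$Id[-n] | d$dose[-1] != d$dose[-n])
  # A run ends on the row just before the next run starts.
  last <- c(first[-1], TRUE)
  data.frame(Id = d$Id[first], Start = d$time[first],
             Stop = d$time[last], dose = d$dose[first])
}

# Example data copied from the question.
orig.data <- cbind(Id = c(rep(1, 4), rep(2, 5)), time = c(1:4, 1:5),
                   dose = c(1, 1, 1, 0, 1, 0, 1, 1, 0))
res <- to_interval(orig.data)
print(res)
```

On the example data this should give the same six rows as the int.data in the
question. Because a new run starts whenever Id changes, two individuals are
kept apart even when one ends and the next begins on the same dose.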

On Wed, Mar 24, 2010 at 12:27 PM, Marie-Pierre Sylvestre
<mp.sylvestre at gmail.com> wrote:
> Hi,
>
> I have a data set in which the variable 'dose' is time-varying. Currently,
> the data set is in a long format, with 1 row for each time unit of follow-up
> for each individual "Id". It looks like this:
>
>
> orig.data <- cbind(Id = c(rep(1,4), rep(2,5)), time = c(1:4, 1:5), dose =
> c(1,1,1,0,1,0,1,1,0))
>
> orig.data
>      Id time dose
>  [1,]  1    1    1
>  [2,]  1    2    1
>  [3,]  1    3    1
>  [4,]  1    4    0
>  [5,]  2    1    1
>  [6,]  2    2    0
>  [7,]  2    3    1
>  [8,]  2    4    1
>  [9,]  2    5    0
>
> What I would like to do is to convert the data set into an interval format.
> By that I mean a data set in which each row has a 'Start' and a 'Stop' value
> that mark the span of consecutive time units over which the 'dose' is
> constant. For example, my orig.data example would now be:
>
> int.data <-  cbind(Id = c(rep(1,2), rep(2,4)), Start = c(1,4,1,2,3,5), Stop
> = c(3,4,1,2,4,5), dose = c(1,0,1,0,1,0))
>
> int.data
>     Id Start Stop dose
> [1,]  1     1    3    1
> [2,]  1     4    4    0
> [3,]  2     1    1    1
> [4,]  2     2    2    0
> [5,]  2     3    4    1
> [6,]  2     5    5    0
>
> Basically, this implies collapsing consecutive rows that have the same "Id"
> and "dose" and creating "Start" and "Stop" to index the time.
>
> While I can write a clumsy routine with multiple loops to do it, it will be
> inefficient and will not scale to large data sets.
>
> I wonder if people know of a function that would reshape my data set from
> 'long' to 'interval'?
>
> Best,
>
> MP



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" W


