[R] within group sequential subtraction

jim holtman jholtman at gmail.com
Thu Mar 10 23:27:26 CET 2011


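Try this. ave() applies the function within each group and returns the
results in the original row order; the leading NA marks the first
observation in each group. It assumes the rows are already sorted by date
within each group:

> data$diff <- ave(as.numeric(data$date_obs), data$group, FUN = function(x) c(NA, diff(x)))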
> data
   group   date_obs diff
1   IND1 1987-09-17   NA
2   IND1 1989-05-04  595
3   IND2 1997-04-30   NA
4   IND2 2008-11-03 4205
5   IND2 2009-05-08  186
6   IND3 1984-01-17   NA
7   IND4 1996-09-28   NA
8   IND5 2000-07-30   NA
9   IND6 1998-01-17   NA
10  IND6 1999-02-25  404
>
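
In case the rows are not already in date order, here is a minimal,
self-contained sketch of the same idea with an explicit sort first. The data
are rebuilt from the original post; the sort step and the use of data.frame()
instead of structure() are my additions, not part of the call above.

```r
# Data from the original post, rebuilt with data.frame() for readability
data <- data.frame(
  group = c("IND1", "IND1", "IND2", "IND2", "IND2",
            "IND3", "IND4", "IND5", "IND6", "IND6"),
  date_obs = as.Date(c(6468, 7063, 9981, 14186, 14372,
                       5129, 9767, 11168, 10243, 10647),
                     origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Sort by group, then by date within group. This data is already ordered,
# but diff() depends on row order, so the sort guards against unsorted input.
data <- data[order(data$group, data$date_obs), ]

# ave() applies FUN to each group's values and returns a vector aligned with
# the rows; diff() yields n - 1 gaps, so a leading NA pads each group.
data$diff <- ave(as.numeric(data$date_obs), data$group,
                 FUN = function(x) c(NA, diff(x)))
data
```

Groups with a single observation get NA, because diff() of a length-one
vector is empty and only the padding NA remains.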


On Thu, Mar 10, 2011 at 9:56 AM, natalie.vanzuydam <nvanzuydam at gmail.com> wrote:
> Hi Everyone,
>
> I would like to do sequential subtractions within a group so that I know the
> time between separate observations for a group of individuals.
>
> My data:
>
> data <- structure(list(group = c("IND1", "IND1", "IND2",
> "IND2", "IND2", "IND3", "IND4", "IND5",
> "IND6", "IND6"), date_obs = structure(c(6468,
> 7063, 9981, 14186, 14372, 5129, 9767, 11168, 10243, 10647), class =
> "Date")), .Names = c("group",
> "date_obs"), row.names = c(NA, 10L), class = "data.frame")
>
> So I start with:
>
>    group   date_obs
> 1   IND1 1987-09-17
> 2   IND1 1989-05-04
> 3   IND2 1997-04-30
> 4   IND2 2008-11-03
> 5   IND2 2009-05-08
> 6   IND3 1984-01-17
> 7   IND4 1996-09-28
> 8   IND5 2000-07-30
> 9   IND6 1998-01-17
> 10  IND6 1999-02-25
>
> what I would like:
>
>    group   date_obs time
> 1   IND1 1987-09-17   NA
> 2   IND1 1989-05-04  595
> 3   IND2 1997-04-30   NA
> 4   IND2 2008-11-03 4205
> 5   IND2 2009-05-08  186
> 6   IND3 1984-01-17   NA
> 7   IND4 1996-09-28   NA
> 8   IND5 2000-07-30   NA
> 9   IND6 1998-01-17   NA
> 10  IND6 1999-02-25  404
>
> So that if there is one entry/individual a 0/NA would be acceptable and if
> there is more than one entry/individual the sequential difference would be
> calculated.
>
> I started with some code, but I cannot edit it appropriately.
>
> x <- do.call(rbind, lapply(split(data, data$group),
>        function(dat) {
>                dat <- dat[order(dat$date_obs), ]
>                d <- diff(dat$date_obs)
>                dat <- rbind(dat, d)
>        }))
>
> I get this error: "Error in as.Date.numeric(value) : 'origin' must be
> supplied" so I'm not sure if it does what I need it to do.  In addition to
> this the vector lengths won't match up as the first date in the sequence
> won't be subtracted from itself.
>
> I'm not sure if anyone knows an easier way to achieve this.
>
> Thanks for the help,
> Natalie
>
>
>
>
> -----
> Natalie Van Zuydam
>
> PhD Student
> University of Dundee
> nvanzuydam at dundee.ac.uk



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


