[R] identify time span in date vector

Petr PIKAL petr.pikal at precheza.cz
Wed Apr 4 14:19:17 CEST 2012


Hi

> 
> Dear Petr,
> 
> thanks for taking the time.
> 
> For this input, the first element should be selected, since there are more
> than 3 more dates within one year (basically, all other dates are within
> one year) and at least one of them is more than 3 months later.
> 
> In the meantime, I came up with some code that (probably) does what I want:
> 
> identify_first_date = function(dates)
> {
>   ### entry [i, j] with i > j: is the later date i within one year of date j?
>   within_one_year = as.matrix(dist(dates)) < 366
>   within_one_year[upper.tri(within_one_year, diag=TRUE)] = FALSE
> 
>   ### entry [i, j] with i > j: is the later date i within 90 days of date j?
>   within_three_months = as.matrix(dist(dates)) < 91
>   within_three_months[upper.tri(within_three_months, diag=TRUE)] = FALSE
> 
>   dates[
>     which(
>       ### more later dates within one year than within 90 days
>       apply(within_one_year,2,sum) > apply(within_three_months,2,sum) &
>       ### at least 3 later dates within one year
>       apply(within_one_year,2,sum) >= 3
>     )[1]]
> }
> 
> I guess the code could be improved, though, as it takes some time to run.

Looking only at each date's immediate successor, your first condition can
be written as

c(as.numeric(diff(dates)) < 365, FALSE) > c(as.numeric(diff(dates)) < 91, FALSE)

so if you put that into your function:

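identify_first_date2 = function(dates)
{
  # lower triangle: TRUE where a later date lies within one year
  within_one_year = as.matrix(dist(dates)) < 366
  within_one_year[upper.tri(within_one_year, diag=TRUE)] = FALSE

  # gaps in days between consecutive dates
  distance <- as.numeric(diff(dates))

  # next date is within a year but more than 90 days away, and
  # at least 3 later dates fall within one year
  dates[which(c(distance < 365, FALSE) > c(distance < 91, FALSE) &
              apply(within_one_year, 2, sum) >= 3)[1]]
}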
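
One caveat, as a rough check rather than a definitive result: diff() only
compares each date with the one right after it, so the shortcut need not pick
the same elements as the pairwise matrices. The sketch below uses the ten
dates printed in this thread, typed in by hand (an assumption, since rnorm()
actually produces fractional days), and the names gap, pairwise and neighbour
are made up for illustration.

```r
# Example dates copied by hand from the printout quoted in this thread
dates <- as.Date(c("2007-08-01", "2007-10-21", "2007-12-08", "2007-12-15",
                   "2008-01-29", "2008-02-14", "2008-02-16", "2008-03-01",
                   "2008-04-02", "2008-04-11"))

# Pairwise version: for each date, count later dates within 365 / 90 days
gap <- as.matrix(dist(as.numeric(dates)))
gap[upper.tri(gap, diag = TRUE)] <- NA      # keep later-vs-earlier pairs only
n_year <- colSums(gap < 366, na.rm = TRUE)
n_near <- colSums(gap < 91,  na.rm = TRUE)
pairwise <- n_year > n_near                 # some later date 91 to 365 days ahead

# Neighbour-only version: looks at the immediate successor of each date
step <- as.numeric(diff(dates))
neighbour <- c(step < 365, FALSE) > c(step < 91, FALSE)

which(pairwise)    # positions flagged by the pairwise condition
which(neighbour)   # positions flagged by the diff() shortcut
```

On these dates every consecutive gap is shorter than 91 days, so the
neighbour-only test flags nothing, while the pairwise test flags the first
element.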

With identify_first_date2 you should get some improvement; however, I am
still struggling to work out how many consecutive dates are within one year.
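
One possible way to count the following dates without building an n x n
matrix is findInterval() on the sorted dates. This is only a sketch: the
function name, its arguments and the 90/365-day thresholds are assumptions
chosen to mirror the code above, and it treats a date exactly 365 (or 90)
days later as still inside the window.

```r
# Hypothetical helper, not part of the original thread
first_date_with_followups <- function(dates, min_followups = 3,
                                      near_days = 90, far_days = 365) {
  d <- sort(as.numeric(dates))            # days since 1970-01-01, ascending
  idx <- seq_along(d)

  # findInterval(x, d) gives the number of elements of d that are <= x,
  # i.e. the position of the last date inside each window
  n_year <- findInterval(d + far_days,  d) - idx   # later dates within a year
  n_near <- findInterval(d + near_days, d) - idx   # later dates within 90 days

  # enough follow-ups in the year, and at least one beyond the 90-day window
  hit <- which(n_year >= min_followups & n_year > n_near)[1]
  as.Date(d[hit], origin = "1970-01-01")  # NA if no element qualifies
}

# Example dates typed in by hand from the printout quoted in this thread
ex <- as.Date(c("2007-08-01", "2007-10-21", "2007-12-08", "2007-12-15",
                "2008-01-29", "2008-02-14", "2008-02-16", "2008-03-01",
                "2008-04-02", "2008-04-11"))
first_date_with_followups(ex)
```

For these ten dates the first element (2007-08-01) qualifies: nine later
dates fall within 365 days and only one of them within 90 days. Since the
work is a sort plus two binary searches per element, it should scale better
on long vectors than the matrix approach, though I have not timed it.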




> 
> Best,
> Felix
> 
> 
> -----Original Message-----
> From: Petr PIKAL [mailto:petr.pikal at precheza.cz]
> Sent: Wednesday, 4 April 2012 09:47
> To: Fischer, Felix
> Cc: r-help at r-project.org
> Subject: Odp: [R] identify time span in date vector
> 
> Hi
> 
> Can you please be more specific? Based on this input, what do you want
> as a result?
> 
> > set.seed(111)
> > dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
> > dates
>  [1] "2007-08-01" "2007-10-21" "2007-12-08" "2007-12-15" "2008-01-29" "2008-02-14" "2008-02-16" "2008-03-01"
>  [9] "2008-04-02" "2008-04-11"
> >
> 
> Regards
> Petr
> 
> > 
> > Hello everyone,
> > 
> > I am trying to identify the first element of a date vector for which the
> > following condition holds: at least 3 more dates within the next 365
> > days, but at least one of these must be 3-12 months later.
> > 
> > dates = as.Date(sort(rnorm(10,3000,100)), origin = "2000-1-1")
> > 
> > Does anyone have an idea how to do this economically? I'll need to apply
> > this to a large dataset with date vectors of various lengths, and I can
> > think only of quite complicated algorithms :(
> > 
> > Any ideas would be appreciated,
> > Felix
> > 
> > 
> 


