[R] Time series analysis

Rui Barradas ruipbarradas at sapo.pt
Thu May 9 16:01:29 CEST 2013


Hello,

Maybe the following will do it. Note, however, that in your data every 
event with start day 2012-02-11 has end day 2012-02-12, so each of those 
events ends after the next one starts and the time differences for that 
day will be negative.
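
As a quick check of that remark, here is a minimal sketch using the second 
event that starts on 2012-02-11 and the end of the event before it, with the 
values copied from the posted data. The names next_start, prev_end and gap 
are only illustrative, and tz = "UTC" is an assumption made to keep 
daylight-saving changes out of the arithmetic.

```r
# Start of the second event on 2012-02-11 (posted as '9:45:00')
next_start <- as.POSIXct("2012-02-11 09:45:00", tz = "UTC")
# End of the first event that started on 2012-02-11
prev_end   <- as.POSIXct("2012-02-12 08:35:00", tz = "UTC")

# Idle time in minutes; negative because the earlier event ends later
gap <- as.numeric(difftime(next_start, prev_end, units = "mins"))
gap
```

The result is -1370, i.e. the earlier event ends 22 hours 50 minutes after 
the next one starts.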



fun2 <- function(x){
	# idle time, in minutes, between the end of one event and the
	# start of the next one (rows are assumed to be in time order)
	fmt <- "%Y-%m-%d %H:%M:%S"
	d <- numeric(nrow(x) - 1)
	for(i in seq_len(nrow(x))[-1]){
		start <- strptime(paste(x[i, 1], x[i, 2]), format = fmt)
		end <- strptime(paste(x[i - 1, 3], x[i - 1, 4]), format = fmt)
		# ask for minutes directly; the automatic units could also be
		# "secs" or "weeks", which would otherwise need rescaling too
		d[i - 1] <- as.numeric(difftime(start, end, units = "mins"))
	}
	d
}

lapply(split(df, df[,1]), fun2)
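
The same idle times can probably also be computed without the explicit 
loop. What follows is only a sketch under some assumptions: the data frame 
is the one from the original post, rows are already in time order within 
each start day, tz = "UTC" is chosen to avoid daylight-saving effects, and 
idle_minutes is a made-up helper name, not something from a package.

```r
# Data as posted in the original question (everything imported as strings)
df <- data.frame(
  dstartday = c(rep("2012-02-10", 4), rep("2012-02-11", 5)),
  dstart    = c("08:05:00", "09:35:00", "12:00:00", "13:00:00", "07:50:00",
                "9:45:00", "13:00:00", "14:05:00", "15:50:00"),
  dendday   = c(rep("2012-02-10", 3), "2012-02-11", rep("2012-02-12", 5)),
  dend      = c("08:35:00", "09:40:00", "12:20:00", "01:00:00", "08:35:00",
                "11:00:00", "13:15:00", "15:00:00", "17:00:00"),
  stringsAsFactors = FALSE
)

# Parse all start and end stamps once (tz = "UTC" is an assumption)
fmt   <- "%Y-%m-%d %H:%M:%S"
start <- as.POSIXct(paste(df$dstartday, df$dstart), format = fmt, tz = "UTC")
end   <- as.POSIXct(paste(df$dendday, df$dend), format = fmt, tz = "UTC")

# Hypothetical helper: for the rows idx of one start day, subtract each
# event's end from the start of the following event, in minutes
idle_minutes <- function(idx) {
  as.numeric(difftime(start[idx][-1], end[idx][-length(idx)],
                      units = "mins"))
}

# One numeric vector of idle times per start day
gaps <- lapply(split(seq_len(nrow(df)), df$dstartday), idle_minutes)
gaps
```

For '2012-02-10' this gives 60, 140, 40, which matches the values asked 
for; the four values for '2012-02-11' are all negative, for the reason 
given above.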



Hope this helps,

Rui Barradas

On 09-05-2013 13:44, Kai Mx wrote:
> Hi Rui,
> thanks for the quick fix. I am still wrapping my mind around your
> expression, but unfortunately it doesn't quite give me what I want. You are
> calculating differences between the start times. However, I would like to
> know the 'idle' periods between the events, i.e. the time between the end of
> one event and the beginning of the next (but only for the events that start
> on the same day).
>
> Best,
>
> Kai
>
>
> On Thu, May 9, 2013 at 1:55 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
>> Hello,
>>
>> If I understand it well, try the following.
>>
>>
>> tmp <- lapply(tapply(as.POSIXct(paste(df[,1], df[,2])), df[,1], diff),
>> `*`, 60)
>> lapply(tmp, as.integer)
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> On 09-05-2013 11:45, Kai Mx wrote:
>>
>>> Hi everybody,
>>> I have an analysis problem that seems a little overwhelming to me, but is
>>> probably not too hard to solve for you guys. I have a (fairly large)
>>> dataframe that indicates usage of a resource on different days:
>>>
>>> df <-data.frame (
>>>     dstartday =c(rep('2012-02-10', 4), rep('2012-02-11', 5)),
>>>     dstart =c('08:05:00','09:35:00', '12:00:00','13:00:00', '07:50:00',
>>> '9:45:00', '13:00:00', '14:05:00', '15:50:00'),
>>>     dendday =c (rep('2012-02-10', 3), '2012-02-11',rep('2012-02-12', 5)),
>>>     dend = c ('08:35:00','09:40:00', '12:20:00', '01:00:00', '08:35:00',
>>> '11:00:00', '13:15:00', '15:00:00', '17:00:00')
>>> )
>>>
>>> Each row reflects an event that starts at the date and time that is
>>> indicated by dstartday/dstart and ends at dendday/dend.
>>> Now I would like to calculate the time intervals in minutes between the
>>> different events that start on a specific day, e.g. for '2012-02-10' it
>>> should be 60, 140, 40. The interval between the last event of the day and
>>> the first event of the next is not relevant and should be ignored. Events
>>> may run overnight, but there should not be any overlaps between start and
>>> end times.
>>> I have imported all the data as strings.
>>> Any thoughts and suggested readings/packages are really appreciated,
>>> thanks!
>>>
>>> Best,
>>>
>>> Kai
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>