[R] Am I working with regularly spaced time series?

Gabor Grothendieck ggrothendieck at gmail.com
Tue Oct 22 17:47:23 CEST 2013


On Tue, Oct 22, 2013 at 11:37 AM, Paul Gilbert <pgilbert902 at gmail.com> wrote:
>
>
> On 13-10-22 06:00 AM, Weiwu Zhang <zhangweiwu at realss.com> wrote:
>>
>> My data is sampled once per minute.
>
>
> At the same second each minute or not? Regularly spaced would mean exactly
> one minute between observations.
>
>
>> There are invalid samples, leaving a lot of holes in the samples;
>> successful samples cover around 80% of all minutes in a day. During
>> the last 4 months of sampling, one month's data was stored on a hard
>> disk that failed, leaving a month-long gap in between.
>
>
> This is called "missing observations". With regular spacing you need to fill
> in the holes with NA. With irregular spacing you can either drop the missing
> observations or, if you know the time at which they were missed, you could
> fill in with NA.
>
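
As a rough illustration of the NA padding described above, here is a
minimal base R sketch. The timestamps and values are made up, and it
assumes the samples are stamped on exact minute boundaries; real data
may need rounding to the minute first.

```r
# Hypothetical samples: minutes 3 and 4 are missing
tt <- as.POSIXct("2013-10-22 00:00:00", tz = "UTC") + 60 * c(0, 1, 2, 5, 6)
x  <- c(10.1, 10.3, 10.2, 10.8, 10.7)

# Complete one-minute grid over the observed range
grid <- seq(min(tt), max(tt), by = "1 min")

# match() gives NA for grid times with no observation,
# and indexing by NA yields NA in the padded series
x_padded <- x[match(grid, tt)]
```

The padded vector has one slot per minute, so it can be passed to
functions that expect regular spacing.
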
>
>>
>> So am I working with regularly spaced time series or not? Should I
>> pad all missing data with NAs and start with ts(), followed by the
>> forecast package (which seems to have all the functions I need in the
>> beginning), or should I start with a library designed with irregular
>> time series in mind?
>>
>> Also, the ts() manual doesn't say how to create a time series with one
>> minute as deltat. It seems to assume time series are about dates. So is
>> the data I have really a time series at all?
>
>
> A ts() representation works best with regularly spaced monthly, quarterly, or
> annual data. You can use it for other things if they fit nicely into
> regularly spaced observations with a frequency of observation, such as 12
> times per year or 60 times per hour. This usually only makes sense if the
> frequency has something to do with your problem, like seasonality questions.
> You can also use frequency 1 for one observation per period, like annual
> data, which in your case would be once per minute. I'm inclined to think
> that a zoo (see package zoo) representation would fit your problem better.
>
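
For what it's worth, the two ts() setups mentioned above could look
roughly like this. It is a sketch only: the data are simulated, and the
series is assumed to be already padded with NA to a regular one-minute
grid. Note that ts() stores just a numeric start, end and frequency, so
the mapping back to clock time is left to you.

```r
# Hypothetical: 3 hours of one-per-minute values with a few NA holes
x <- sin(2 * pi * (0:179) / 60)
x[c(10, 11, 50)] <- NA

# Time unit = hour, 60 observations per unit
# (useful if an hourly cycle matters to the problem)
x_hourly <- ts(x, start = c(0, 1), frequency = 60)

# Time unit = minute, one observation per unit
x_minute <- ts(x, start = 0, frequency = 1)

tsp(x_hourly)  # start, end, frequency in hours
tsp(x_minute)  # start, end, frequency in minutes
```
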

Also note that the zoo package has two classes:

1. zoo for irregularly spaced series
2. zooreg for series with an underlying regularity but for which some
of the points are missing (which seems to be the situation under
discussion)

The two classes are nearly the same but zooreg series have a frequency
and some methods act differently -- most notably lag and diff.
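
A small sketch of that difference, using a made-up series indexed by
minute number with minutes 4 and 5 missing (the values are arbitrary,
and as.zooreg() is assumed to infer a frequency of 1 from the index):

```r
library(zoo)

# Hypothetical observations at minutes 1, 2, 3 and 6
x  <- c(1, 2, 3, 6)
zz <- zoo(x, x)        # plain zoo series
zr <- as.zooreg(zz)    # same data, with an underlying regularity

# lag.zoo moves each point to the adjacent observed time,
# whereas lag.zooreg moves each point by one deltat
lag(zz, k = -1)
lag(zr, k = -1)

# diff is defined in terms of lag, so it differs as well:
# diff(zz) differences neighbouring observations, while diff(zr)
# only yields a value where the point one deltat earlier exists
diff(zz)
diff(zr)
```
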

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


