[R] Missing data?

Gabor Grothendieck ggrothendieck at gmail.com
Sat Nov 26 22:13:05 CET 2011


On Tue, Nov 22, 2011 at 6:50 PM, Kevin Burton <rkevinburton at charter.net> wrote:
> Void of any other suggestions this approach makes sense but for my case I
> think I need to use zoo objects rather than xts. If I sequence the data
> generally I don't know if there will be 365 days in the year or 366. So I
> have to sequence the dates as:
>
> seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"), by="day")
>
> If I use this sequence with xts I get:
>
>> ds <- xts(NA, seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"),
> by="day"))
> Error in xts(NA, seq(from = as.Date("2011-01-01"), to =
> as.Date("2011-12-31"),  :
>  NROW(x) must match length(order.by)
>
> If I leave the 'data' empty I don't get the error but if I try to assign an
> individual item (fill as appropriate)
>
>> ds <- xts(, seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"),
> by="day"))
>> ds["2011-12-24"] <- 10
>> ds
> Error in structure(coredata(x), names = x.attr$dimnames[[1]]) :
>  'names' attribute [365] must be the same length as the vector [358]
>
> So now I need to remember that I have not filled in all of the data. Also
> simple dereferencing gives:
>
>> ds[1]
> Error in `[.xts`(ds, 1) : subscript out of bounds
>
> With zoo I am able to create a time-series where all of the data is
> initially NA:
>
>> ds <- zoo(NA, seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"),
> by="day"))
>
> So I can fill the data as appropriate and the remaining slots will have NA.
> I may be new with xts but I cannot see a way of creating a useable 'blank'
> time-series.
>
> Also with xts it seems like the frequency is ignored.
>
>> ds <- xts(1:365, seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"),
> by="day"), frequency=52)
>> frequency(ds)
> [1] 1
>
> Whereas zoo remembers the frequency setting
>
>> ds <- zoo(1:365, seq(from=as.Date("2011-01-01"), to=as.Date("2011-12-31"),
> by="day"), frequency=52)
>> frequency(ds)
> [1] 52
>
> But since the ultimate goal is to get the time-series in a 'ts' format (as
> many functions require 'ts') it seems like even zoo has problems:

The problem is that you seem to want a fixed number of periods per
year, but a year has neither a whole number of weeks (365 = 52*7 + 1)
nor a constant number of days (365 or 366).  You are going to have to
give up something since your apparent criteria conflict with reality.
For example, you could use months, in which case there are exactly 12
per year, or you could stick more than 7 days into the first or last
week of the year so that there are exactly 52 weeks in a year but they
don't all have the same number of days, etc.
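
As a rough sketch of those two options in base R (the `values` vector
is placeholder data, and the names `month_index`, `week_index`,
`monthly` and `weekly` are illustrative, not from the thread; summing
within each period is also just an assumption, a mean may suit your
data better):

```r
# Daily dates for 2011; seq() copes with 365 vs 366 days on its own.
dates <- seq(from = as.Date("2011-01-01"), to = as.Date("2011-12-31"),
             by = "day")
values <- seq_along(dates)  # placeholder data, purely illustrative

# Option 1: aggregate to months, giving exactly 12 periods per year.
month_index <- as.integer(format(dates, "%m"))
monthly <- ts(as.vector(tapply(values, month_index, sum)),
              start = c(2011, 1), frequency = 12)

# Option 2: force exactly 52 "weeks" by folding the leftover day(s)
# at the end of the year into week 52.
day_of_year <- as.integer(format(dates, "%j"))
week_index <- pmin((day_of_year - 1) %/% 7 + 1, 52)
weekly <- ts(as.vector(tapply(values, week_index, sum)),
             start = c(2011, 1), frequency = 52)
```

Here the last "week" simply absorbs the extra day (two in a leap
year), so both results are regular 'ts' objects at the cost of one
unequal period.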

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


