[R] Missing data?

Gabor Grothendieck ggrothendieck at gmail.com
Sun Nov 27 23:23:42 CET 2011


On Sun, Nov 27, 2011 at 4:08 PM, Kevin Burton <rkevinburton at charter.net> wrote:
> I admit it isn't reality, but I was hoping that through judicious use of these functions I could approximate reality. For example, in the years where there are more than 53 weeks in a year I would be happy if there were a way to recognize this and drop the last week of data. If there were fewer than 53 I would "pad" the year with an extra dummy week. This is just about the same as your suggestion of putting more than 7 days in the first and last weeks. But I still need this kind of date manipulation to even know how many days to add in to make the approximation viable. This kind of best approximation to reality seems better than settling for the resolution of a month just because it is consistent. Daily would be too much data, and even then there would be an approximation due to leap years.
>

OK. Since you are willing to regard days past the 364th as part of the
last week of the year, we can do this.

Create a zoo object z as test data. Then convert its time scale to
year + week/52, where week 0 is the first week of the year and any
week greater than 51 is replaced with 51. Then aggregate z by week,
taking the last data point in each week, and convert the result to ts.
Because of the way the time scale is constructed, the frequency will
be 52.

library(zoo)

# test data
z <- zoo(1:100, Sys.Date() + 1:100)

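# time index: year + week/52 with week in 0..51 (days past the 364th go to week 51)
yr.wk <- with(as.POSIXlt(time(z)), year + 1900 + pmin(yday %/% 7, 51) / 52)
# keep the last observation in each week, then convert to ts
z.wk <- aggregate(z, yr.wk, tail, 1)
z.ts <- as.ts(z.wk)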

frequency(z.ts) # 52
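
To see what the week index does at the edges of a year, here is a small
sketch using base R only. The dates and the names d, lt, wk and idx are
made up for illustration and are not part of the test data above; 2011
is used because it is not a leap year, so Dec 31 is the 365th day.

```r
# Illustrative fixed dates (an assumption, chosen to hit the year edges)
d <- as.Date(c("2011-01-01", "2011-01-07", "2011-01-08",
               "2011-12-24", "2011-12-31"))
lt <- as.POSIXlt(d)

# yday is 0-based, so yday %/% 7 is the 0-based week of the year;
# pmin() folds anything past week 51 back into week 51
wk <- pmin(lt$yday %/% 7, 51)
wk  # 0 0 1 51 51

# same time index as yr.wk above: year + week/52
idx <- lt$year + 1900 + wk / 52
```

Dec 31 has yday 364, which would be week 52 without the pmin(), so it is
folded into week 51 together with Dec 24. That is what keeps every year
at exactly 52 index values.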

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com