Wed May 12 22:58:47 CEST 2010
Thanks Gabor, this looks like it serves my needs. I've extended the code to work with an example where we have two multicolumn zoo objects, one with the original data and another that has the growth rates.
library(zoo)
# mat1 = zoo object to extend
# mat2 = zoo object whose growth rate is used to extend mat1
mergeGrowth <- function(mat1, mat2)
{
    ix <- is.na(mat1)                              # cells of mat1 to fill in
    mat1.locf <- na.locf(mat1, na.rm=FALSE)        # last observed value of mat1
    mat2.locf <- mat2
    mat2.locf[ix] <- NA                            # blank mat2 where mat1 is missing
    mat2.locf <- na.locf(mat2.locf, na.rm=FALSE)   # mat2 at the last date mat1 was observed
    coredata(mat1)[ix] <- coredata(mat1.locf * mat2 / mat2.locf)[ix]
    mat1
}
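A small usage sketch for the function above. The data are made up for illustration and are not from the thread. The function is repeated so the block runs on its own; the one difference is that the NA mask is computed and applied through coredata(), which is assumed here to behave the same as the direct assignment when both objects share an index.

```r
library(zoo)

# Same logic as mergeGrowth above, repeated so this block is self-contained.
mergeGrowth <- function(mat1, mat2) {
  ix <- is.na(coredata(mat1))                     # cells of mat1 to fill in
  mat1.locf <- na.locf(mat1, na.rm = FALSE)       # last observed value of mat1
  mat2.locf <- mat2
  coredata(mat2.locf)[ix] <- NA                   # blank mat2 where mat1 is missing
  mat2.locf <- na.locf(mat2.locf, na.rm = FALSE)  # mat2 at the last date mat1 was observed
  coredata(mat1)[ix] <- coredata(mat1.locf * mat2 / mat2.locf)[ix]
  mat1
}

# Made-up example data: two series with different missing patterns.
dates <- as.Date(1:7)
mat1 <- zoo(cbind(a = c(1, 2, NA, NA, 5, NA, NA),
                  b = c(10, NA, NA, 50, NA, NA, NA)), dates)
mat2 <- zoo(cbind(a = seq(7)^2,
                  b = c(1, 2, 4, 8, 16, 32, 64)), dates)

res <- mergeGrowth(mat1, mat2)
print(res)
```

Each missing cell is filled with the last observed value in its own column, scaled by how much the matching mat2 column has grown since that date.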
Abiel Reinhart
Yes, that is what it does. Note that na.approx interpolates and does
not work precisely as you discussed, but it's easy, does use m[,2], and
may be good enough. If you really do want something precisely as you
discussed try this. It NAs out the rows of m for which column 1 is NA
and then uses na.locf to move the prior non-NA into it. Then we apply
the formula:
> library(zoo)
> m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, NA), seq(7)^2), as.Date(1:7))
>
> # mm will hold result; m.locf holds the last non-NA rows carried forward
> m.locf <- mm <- m
> ix <- is.na(mm[,1])
> m.locf[ix,] <- NA
> m.locf <- na.locf(m.locf)
> mm[ix, 1] <- m.locf[ix, 1] * mm[ix,2] / m.locf[ix,2]
> mm
1970-01-02 1.0 1
1970-01-03 2.0 4
1970-01-04 4.5 9
1970-01-05 8.0 16
1970-01-06 5.0 25
1970-01-07 7.2 36
1970-01-08 9.8 49
Yes, that is what it does. Please read the help file for na.approx and
approx. If you want something different you will have to special case
the end values.
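To illustrate the point about the end values, here is a minimal sketch using base R's approx() directly, with the observed points from the example in this thread (m[,1] against m[,2]). It only shows the difference between rule = 1 and rule = 2 outside the observed range; it is not a statement about how na.approx is implemented internally.

```r
# Observed points: non-NA values of m[, 1] against the matching m[, 2].
x <- c(1, 4, 25)
y <- c(1, 2, 5)

# Points to fill: two inside the observed range, two beyond it.
xout <- c(9, 16, 36, 49)

# rule = 1 (the default) returns NA outside the range of x.
r1 <- approx(x, y, xout = xout, rule = 1)$y

# rule = 2 repeats the value at the nearest end point, so the series
# is carried forward flat rather than grown.
r2 <- approx(x, y, xout = xout, rule = 2)$y

print(r1)
print(r2)
```

With rule = 2 the two points beyond x = 25 both take the last observed value, which is why the output looks like na.approx() followed by na.locf().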
> Maybe I am doing this wrong, but rule=2 does not look like it is growing the series out, but rather just carrying the last value forward. It looks like na.approx() followed by na.locf(). For instance:
>
> m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, NA), seq(7)^2), as.Date(1:7))
> na.approx(m[, 1], x = m[, 2], rule=2)
>
> 1970-01-02 1970-01-03 1970-01-04 1970-01-05 1970-01-06 1970-01-07 1970-01-08
> 1.0 2.0 2.7 3.7 5.0 5.0 5.0
>
> Abiel Reinhart
>
>
Use rule = 2 as in the extrapolation examples in the na.approx help file.
>
>> This comes close to solving my problem, but I am still left with the problem of how I can extrapolate, not just interpolate. In our example, if I define m as,
>>
>> m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, NA), seq(7)^2), as.Date(1:7))
>>
>> instead of
>>
>> m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, 7), seq(7)^2), as.Date(1:7))
>>
>> then I will only get five values back, when I really want m[,1] to be fully extrapolated so that there are seven values. Is there a workaround?
>>
>> Abiel Reinhart
>>
>>
>> Try this using the zoo package. See ?na.approx for more and note that
>> this functionality requires zoo 1.6-3 or later.
>>
>> > m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, 7), seq(7)^2), as.Date(1:7))
>> > na.approx(m[, 1], x = m[, 2])
>> 1970-01-02 1970-01-03 1970-01-04 1970-01-05 1970-01-06 1970-01-07 1970-01-08
>> 1.000000 2.000000 2.714286 3.714286 5.000000 5.916667 7.000000
>> > na.approx(m[, 1])
>> 1970-01-02 1970-01-03 1970-01-04 1970-01-05 1970-01-06 1970-01-07 1970-01-08
>> 1 2 3 4 5 6 7
>>
>>
I have two identically sized matrices of data that represent time series (I am storing the data in zoo objects, but the idea should apply to any matrix of data). The time series in the second matrix extend further than in the first matrix, and I would like to use the data in matrix 2 to extrapolate the data in matrix 1. In other words, if is.na(mat1[i,j]), then mat1[i,j] <- mat1[i-1,j]*mat2[i,j]/mat2[i-1,j]. Of course, before we can calculate mat1[i,j] we may need to calculate mat1[i-1,j], and that in turn may require the computation of mat1[i-2,j], etc. This could all clearly be done with loops, but I am wondering if anyone can think of a vectorized expression or other efficient method that would work.

Thanks very much.

Abiel Reinhart
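
For reference, the loop version mentioned in the question can be sketched as below. The function name growLoop is made up for this illustration, and plain matrices are assumed (for zoo objects one could pass coredata(mat1) and coredata(mat2)). Because the ratios mat2[i,j]/mat2[i-1,j] telescope, filling a run of NAs this way gives the same result as multiplying the last observed mat1 value by mat2[i,j] divided by mat2 at that last observed row, which is what a vectorized na.locf approach relies on.

```r
# Fill NAs in mat1 row by row using the period-to-period growth of mat2.
# growLoop is a hypothetical name; mat1 and mat2 are plain numeric matrices
# of the same dimensions.
growLoop <- function(mat1, mat2) {
  for (j in seq_len(ncol(mat1))) {
    # Start at row 2: row 1 has no predecessor to grow from.
    for (i in seq_len(nrow(mat1))[-1]) {
      if (is.na(mat1[i, j])) {
        mat1[i, j] <- mat1[i - 1, j] * mat2[i, j] / mat2[i - 1, j]
      }
    }
  }
  mat1
}

# The single-column example used elsewhere in this thread.
mat1 <- cbind(c(1, 2, NA, NA, 5, NA, NA))
mat2 <- cbind(seq(7)^2)

res <- growLoop(mat1, mat2)
print(res)
```

A leading NA in mat1 stays NA here, since there is no earlier value to grow from.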
