[R] error msg using na.approx "x and index must have the same length"

R. Michael Weylandt michael.weylandt at gmail.com
Sun Oct 14 17:46:53 CEST 2012


On Fri, Oct 12, 2012 at 1:26 AM, Jay Rice <jsrice18 at gmail.com> wrote:
> Below I have written out some simplified data from my dataset. My goal is
> to interpolate Price based on timestamp. Therefore the closer a Price is in
> time to another price, the more like that price it will be. I want the
> interpolations for each St and not across St (St  is a factor with levels
> A, B, and C). Unfortunately, I get error messages from code I wrote.
>
> In the end only IDs 10 and 14 will receive interpolated values because all
> other NAs occur at the beginning of a level.  My code is given below the
> dataset.
>
> ID is int
> St is factor with 3 levels
> timestamp is POSIXlt
> Price is num
>
> Data.frame name is portfolio
>
> ID   St               timestamp                   Price
> 1     A    2012-01-01 12:50:24.760      NA
> 2     A    2012-01-01 12:51:25.860   72.09
> 3     A    2012-01-01 12:52:21.613   72.09
> 4     A    2012-01-01 12:52:42.010   75.30
> 5     A    2012-01-01 12:52:42.113   75.30
> 6     B    2012-01-01 12:56:20.893       NA
> 7     B    2012-01-01 12:56:46.023    67.70
> 8     B    2012-01-01 12:57:19.300    76.06
> 9     B    2012-01-01 12:58:20.750    77.85
> 10   B    2012-01-01 12:58:20.797      NA
> 11   B    2012-01-01 12:59:19.527    79.57
> 12   C    2012-01-01 13:00:21.847    81.53
> 13   C    2012-01-01 13:00:21.860    81.53
> 14   C    2012-01-01 13:00:21.873       NA
> 15   C    2012-01-01 13:00:43.493    84.69
> 16   D    2012-01-01 12:01:21.520    24.63
> 17   D    2012-01-01 12:02:18.880    21.13
>
> I tried the following using na.approx from zoo package
>
> interpolatedPrice<-unlist(tapply(portfolio$Price, portfolio$St, na.approx,
> portfolio$timestamp, na.rm=FALSE))

Your problem is that tapply() splits portfolio$Price by St but passes
portfolio$timestamp through whole, so each group's Price vector is
paired with the full-length timestamp vector and na.approx() complains
that the lengths differ. I think you want something more like this:

lapply(split(portfolio, portfolio$St), function(x)
zoo(na.approx(x$Price, x$timestamp, na.rm = FALSE), x$timestamp))

which is admittedly opaque.
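
For what it's worth, here is a self-contained sketch of the same per-St
interpolation on the A and B rows of the posted data. It assumes the zoo
package is installed; the helper name interp_by_group, the conversion of
timestamp to POSIXct, and tz = "UTC" are illustrative choices, not
something from the original post.

```r
library(zoo)

# A and B rows of the posted data; timestamps stored as POSIXct (assumption)
portfolio <- data.frame(
  ID = 1:11,
  St = factor(c(rep("A", 5), rep("B", 6))),
  timestamp = as.POSIXct(c(
    "2012-01-01 12:50:24.760", "2012-01-01 12:51:25.860",
    "2012-01-01 12:52:21.613", "2012-01-01 12:52:42.010",
    "2012-01-01 12:52:42.113", "2012-01-01 12:56:20.893",
    "2012-01-01 12:56:46.023", "2012-01-01 12:57:19.300",
    "2012-01-01 12:58:20.750", "2012-01-01 12:58:20.797",
    "2012-01-01 12:59:19.527"), tz = "UTC"),
  Price = c(NA, 72.09, 72.09, 75.30, 75.30,
            NA, 67.70, 76.06, 77.85, NA, 79.57)
)

# Hypothetical helper: interpolate Price against timestamp within one group.
# na.rm = FALSE keeps leading/trailing NAs, so the result has the group's length.
interp_by_group <- function(x) {
  as.numeric(na.approx(x$Price, x = x$timestamp, na.rm = FALSE))
}

# split() cuts the whole data frame by St, so Price and timestamp stay aligned
pieces <- lapply(split(portfolio, portfolio$St), interp_by_group)

# Put the per-group results back in the original row order
portfolio$interpolatedPrice <- unsplit(pieces, portfolio$St)
```

IDs 1 and 6 stay NA because they lead their groups, while ID 10 lands
just above 77.85, since it falls only 47 ms after ID 9.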

I think an easier data management strategy for you might be to put
your data in a list of zoo/xts series and use lapply generously.

E.g.,

pp <- lapply(split(portfolio, portfolio$St), function(x)
zoo(x$Price, x$timestamp))

and then run na.approx(), or any other calculation, over pp with lapply().
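
A runnable sketch of that list-of-zoo approach, using the B and C rows of
the posted data. It assumes zoo is installed and that timestamp is held
as POSIXct with tz = "UTC"; the names pp_filled and n_missing are made up
for illustration.

```r
library(zoo)

# B and C rows of the posted data; timestamps stored as POSIXct (assumption)
portfolio <- data.frame(
  St = factor(c(rep("B", 6), rep("C", 4))),
  timestamp = as.POSIXct(c(
    "2012-01-01 12:56:20.893", "2012-01-01 12:56:46.023",
    "2012-01-01 12:57:19.300", "2012-01-01 12:58:20.750",
    "2012-01-01 12:58:20.797", "2012-01-01 12:59:19.527",
    "2012-01-01 13:00:21.847", "2012-01-01 13:00:21.860",
    "2012-01-01 13:00:21.873", "2012-01-01 13:00:43.493"), tz = "UTC"),
  Price = c(NA, 67.70, 76.06, 77.85, NA, 79.57,
            81.53, 81.53, NA, 84.69)
)

# One zoo series per St, each indexed by its own timestamps
pp <- lapply(split(portfolio, portfolio$St),
             function(x) zoo(x$Price, x$timestamp))

# Interpolate every series against its own time index;
# na.rm = FALSE keeps the leading NA in B instead of dropping it
pp_filled <- lapply(pp, na.approx, na.rm = FALSE)

# Further per-series calculations follow the same lapply/sapply pattern,
# e.g. counting the NAs that interpolation could not fill
n_missing <- sapply(pp_filled, function(z) sum(is.na(z)))
```

Keeping each series together with its index means there is no separate
timestamp vector that can drift out of step with Price.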

Cheers,
Michael


>
> but keep getting error
> "Error in na.approx.default(X[[1L]], ...) :
>   x and index must have the same length"
>
> I checked the length of every variable in the formula and they all have the
> same length so I am not sure why I get the error message.
>
> Jay
>



