[R] Aggregating subsets of data in larger vector with sapply

Joshua Ulrich josh.m.ulrich at gmail.com
Wed Jan 12 05:35:48 CET 2011


Hi Chris,

This seems to work on the sample data you provided.

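library(xts)

# Convert the character sizes to numeric, then sum them within each second.
FUN <- function(x) {
  x <- xts(as.numeric(x), index(x))
  period.apply(x, endpoints(x, "secs"), sum)
}
# Split Size by Direction ("buy", "sell", "unkn") and aggregate each group.
lapply(split.default(xSym$Size, xSym$Direction), FUN)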
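
If you want a cross-check that does not depend on xts, here is a minimal base R sketch of the buy totals. It swaps period.apply() for tapply() and assumes the timestamps have whole-second resolution, as in your sample. The vectors times, direction and size are just the head() rows from your post typed in by hand (the names are mine, not part of your data).

```r
# Sample rows copied by hand from the head() output in the original post.
times <- as.POSIXct(c("2011-01-05 09:30:00", "2011-01-05 09:30:02",
                      "2011-01-05 09:30:02", "2011-01-05 09:30:04",
                      "2011-01-05 09:30:04", "2011-01-05 09:30:04"),
                    tz = "UTC")
direction <- c("unkn", "sell", "buy", "buy", "buy", "buy")
size <- c(" 865", " 100", " 100", " 100", " 100", "  41")

# Keep only the buys and convert the character sizes to numbers once,
# instead of calling as.numeric() inside a loop over every second.
is_buy <- direction == "buy"
buy_size <- as.numeric(size[is_buy])

# Group by the timestamp formatted to whole seconds and sum each group.
second <- format(times[is_buy], "%Y-%m-%d %H:%M:%S")
buy_per_sec <- tapply(buy_size, second, sum)
buy_per_sec
```

Seconds with no buys simply do not appear in the result, whereas your sapply loop returns 0 for them.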

Best,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com



On Sun, Jan 9, 2011 at 6:10 PM, rivercode <aquanyc at gmail.com> wrote:
>
>
> I have 40,000 rows of buy/sell trade data and am trying to add up the buys
> for each second. The code works, but it is very slow. Any suggestions on how
> to improve the sapply call?
>
> # vector of the last row index for each second in an xts time series
> # object with multiple entries for each second
> secEP = endpoints(xSym$Direction, "secs")
> d = xSym$Direction
> s = xSym$Size
> buySize = sapply(1:(length(secEP)-1), function(y) {
>        i =  (secEP[y]+ 1):secEP[y+1]; # index of vectors between each secEP
>        return(sum(as.numeric(s[i][d[i] == "buy"])));
> } )
>
> Object details:
>
> secEP = numeric Vector of one second Endpoints in xSym$Direction.
>
>> head(xSym$Direction)
>                    Direction
> 2011-01-05 09:30:00 "unkn"
> 2011-01-05 09:30:02 "sell"
> 2011-01-05 09:30:02 "buy"
> 2011-01-05 09:30:04 "buy"
> 2011-01-05 09:30:04 "buy"
> 2011-01-05 09:30:04 "buy"
>
>> head(xSym$Size)
>                    Size
> 2011-01-05 09:30:00 " 865"
> 2011-01-05 09:30:02 " 100"
> 2011-01-05 09:30:02 " 100"
> 2011-01-05 09:30:04 " 100"
> 2011-01-05 09:30:04 " 100"
> 2011-01-05 09:30:04 "  41"
>
> Thanks,
> Chris


