[R] Aggregating subsets of data in larger vector with sapply

rivercode aquanyc at gmail.com
Mon Jan 10 01:10:38 CET 2011



I have 40,000 rows of buy/sell trade data and am trying to add up the buys for
each second. The code works, but it is very slow. Any suggestions for how to
improve the sapply call?

secEP <- endpoints(xSym$Direction, "secs")  # indices of the last row in each
                                            # second of the xts object, which has
                                            # multiple entries per second
d <- xSym$Direction
s <- xSym$Size
buySize <- sapply(1:(length(secEP) - 1), function(y) {
  i <- (secEP[y] + 1):secEP[y + 1]  # rows between consecutive endpoints
  sum(as.numeric(s[i][d[i] == "buy"]))
})

Object details:

secEP: numeric vector of one-second endpoints in xSym$Direction.

> head(xSym$Direction)
                    Direction
2011-01-05 09:30:00 "unkn"   
2011-01-05 09:30:02 "sell"   
2011-01-05 09:30:02 "buy"    
2011-01-05 09:30:04 "buy"    
2011-01-05 09:30:04 "buy"    
2011-01-05 09:30:04 "buy" 

> head(xSym$Size)
                    Size  
2011-01-05 09:30:00 " 865"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 "  41"
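For reference, here is a self-contained base-R reproduction of the aggregation
on the six rows shown above (a sketch only: plain vectors stand in for the xts
columns, and the endpoint values are made up to match the three one-second
buckets in the sample data):

# Toy stand-ins for xSym$Direction and xSym$Size (Size is stored as character,
# hence the as.numeric below).
d <- c("unkn", "sell", "buy", "buy", "buy", "buy")
s <- c(" 865", " 100", " 100", " 100", " 100", "  41")
secEP <- c(0, 1, 3, 6)  # last row index of each one-second bucket

buySize <- sapply(1:(length(secEP) - 1), function(y) {
  i <- (secEP[y] + 1):secEP[y + 1]      # rows within this second
  sum(as.numeric(s[i][d[i] == "buy"]))  # total "buy" size in the bucket
})
buySize  # 0 for 09:30:00, 100 for 09:30:02, 241 for 09:30:04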

Thanks,
Chris




