[R] efficiency when processing ordered data frames
bkmooney at gmail.com
Wed May 20 14:54:28 CEST 2009
Hoping for a little insight into how to make sure I have R running as
efficiently as possible.
Suppose I have a data frame, A, with n rows and m columns, where col1
is a date-time stamp. Also suppose that when this data is imported
(from a CSV file or an SQL query), it is already sorted such that the
time stamp in col1 is in ascending (or descending) order.
If I then wanted to select only the rows of A where col1 <= a certain
time, I am wondering if R has to read through the entirety of col1
(all n rows) to select those rows. Is it possible for R to recognize
(or somehow be told) that these rows are already in order, thus
allowing the computation to be completed in ~log(n) row reads?
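
To make the question concrete, here is a minimal sketch of the kind of
lookup I have in mind, assuming col1 is in ascending order. The helper
name count_leq and the example data are made up for illustration; this
is a hand-written binary search, not something R does automatically.

```r
# Hypothetical helper: number of leading elements of an ascending vector
# that are <= cutoff, found by binary search (about log2(n) comparisons).
count_leq <- function(sorted_vec, cutoff) {
  lo <- 0L                    # invariant: elements 1..lo are <= cutoff
  hi <- length(sorted_vec)    # invariant: elements after hi are > cutoff
  while (lo < hi) {
    mid <- (lo + hi + 1L) %/% 2L
    if (sorted_vec[mid] <= cutoff) {
      lo <- mid
    } else {
      hi <- mid - 1L
    }
  }
  lo
}

# Made-up example data: hourly time stamps, already in ascending order
A <- data.frame(
  col1 = as.POSIXct("2009-05-20 00:00:00", tz = "UTC") + 3600 * (0:9),
  col2 = 1:10
)
cutoff <- as.POSIXct("2009-05-20 03:30:00", tz = "UTC")

k <- count_leq(A$col1, cutoff)   # rows 1..k satisfy col1 <= cutoff
subset_A <- A[seq_len(k), ]      # take the leading block of rows
```

The search only touches about log2(n) elements of col1, although
copying the k selected rows still costs time proportional to k. For
descending data the comparison would have to be flipped and the
trailing block of rows taken instead.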