[R] Corrected : Efficient writing of calculation involving each element of 2 data frames

jim holtman jholtman at gmail.com
Sat Feb 23 01:06:11 CET 2008


Take a look at the 'embed' function.  With that you can create a matrix
with the data shifted by one position in each column.  You would want to do
embed(your.data, 100).
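
A rough sketch of how that could look, not tested on your data. Here x and
y are made-up stand-ins for df1$V2 and df2$V2, and w is the window length.
Note that embed() only builds the lagged matrices; the row-wise correlation
formula below is an assumption about how you might finish the job, written
out by hand with rowMeans() and rowSums() so that no loop over cor() is
needed.

```r
set.seed(1)
n <- 500                      # hypothetical series length (yours is ~30000)
w <- 100                      # window length in days
x <- rnorm(n)                 # stand-in for df1$V2
y <- 0.5 * x + rnorm(n)       # stand-in for df2$V2

# Each row of embed(x, w) holds one window of w consecutive values
# (in reverse time order, which does not matter for a correlation
# as long as both series are embedded the same way).
ex <- embed(x, w)
ey <- embed(y, w)

# Centre every row on its own mean, then apply the Pearson formula row by row.
cx <- ex - rowMeans(ex)
cy <- ey - rowMeans(ey)
roll_cor <- rowSums(cx * cy) / sqrt(rowSums(cx^2) * rowSums(cy^2))

# Sanity check against the original loop-style calculation.
loop_cor <- sapply(w:n, function(i) cor(x[(i - w + 1):i], y[(i - w + 1):i]))
stopifnot(isTRUE(all.equal(roll_cor, loop_cor)))
```

roll_cor[k] is the correlation over rows k to k + w - 1, so it lines up
with my_corr[k] in your loop.  Each embedded matrix has (n - w + 1) * w
entries, which for 30000 rows and a window of 100 is about 3 million
doubles, so memory should not be a problem.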

On Fri, Feb 22, 2008 at 4:15 PM, Vikas N Kumar
<vikasnkumar at users.sourceforge.net> wrote:
> Hi
>
> I have 2 data.frames each of the same number of rows (approximately 30000 or
> more entries).
> They also have the same number of columns, let's say 2.
> One column has the date, the other column has a double precision number. Let
> the column names be V1, V2.
>
> Now I want to calculate the correlation of the 2 sets of data, for the last
> 100 days for every day available in the data.frames.
>
> My code looks like this :
> # Let df1, and df2 be the 2 data frames with the required data
> ## begin code snippet
>
> my_corr <- numeric(nrow(df1) - 99)   # one result per 100-row window
> for (i_start in 100:nrow(df1)) {
>   idx <- (i_start - 99):i_start      # the 100 rows ending at i_start
>   my_corr[i_start - 99] <- cor(x = df1[idx, "V2"], y = df2[idx, "V2"])
> }
> ## end of code snippet
>
> This runs very slowly, and takes more than an hour to run if I have to
> calculate correlation between 10 data sets leaving me with 45 runs of this
> snippet or taking more than 30 minutes to run.
>
> Is there an efficient  way to write  this piece of code where I can get it
> to run faster ?
>
> If I do something similar in Excel, it is much faster. But I have to use R,
> since this is a part of a bigger program.
>
> Any help will be appreciated.
>
> Thanks and Regards
> Vikas



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


