[R] recursive beta with cutoffs on large data set

Dirk Eddelbuettel edd at debian.org
Sun Jun 15 19:24:54 CEST 2008


On 15 June 2008 at 12:50, ivo welch wrote:
| dear R experts:  I have an academic question that borders on asking
| for consulting help, so I hope I am not too imposing.  If I am, please
| ignore me.
| I have a 100MB data set of daily stock returns.  I want to
| compute rolling (recursive?) betas---either bivariate or
| multivariate---with respect to some other data time series.  Many of
| these regressions are "take away the first observation, add one
| observation at the end," which means I really have only about 30,000
| unique regressions---still, quite a good number.   Worse, I want to
| winsorize the rolling y-vector at different levels (99%&1%, 98%&2%,
| ...), so I want to repeat this procedure a few hundred times at
| different winsorization levels.
| The most important version of my task is bivariate regressions, which
| may mean that I don't even need MV overhead.
| I was even thinking of coding in C rather than R for speed's sake, but I
| am now thinking that learning the intricacies of fast vector
| processing on x86 processors is so difficult that I would be done running
| it in R before I would be done programming it.
| Has anyone done something like this?  Any recommendations for what
| could help give me the high speed I probably need for a task like
| this?  Any thoughts?


Have a look at help(lm), which says

    'lm' calls the lower level functions 'lm.fit', etc, see below, for
    the actual numerical computations.  For programming only, you may
    consider doing likewise.

suggesting lm.fit for these kinds of bare-bones regressions from R (e.g. in the
context of bootstraps or extended simulations).
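
As a rough illustration of that suggestion, here is a minimal sketch on
simulated data (the series, sizes and variable names are made up for the
example and are not from the original question). It fits the same bivariate
regression once with lm and once with lm.fit, which takes an explicit design
matrix and skips the formula and model-frame overhead:

```r
set.seed(42)
n <- 250
x <- rnorm(n)                      # simulated "market" return (hypothetical data)
y <- 0.8 * x + rnorm(n, sd = 0.5)  # simulated "stock" return (hypothetical data)

# lm.fit needs the design matrix spelled out, including the intercept column.
X <- cbind(Intercept = 1, x = x)

fit_fast <- lm.fit(X, y)   # bare-bones least squares via QR
fit_full <- lm(y ~ x)      # full interface, same numerical result

beta_fast <- unname(fit_fast$coefficients["x"])
beta_full <- unname(coef(fit_full)["x"])

print(c(lm.fit = beta_fast, lm = beta_full))
```

In the purely bivariate case the slope is also just cov(x, y) / var(x), so
even lm.fit may be more machinery than strictly needed.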

You have to think about where your bottlenecks really are.  Maybe they are in
the data preparation, with all your rolling winsorized setups. If that is
the case, I'd stay in R.  Otherwise, interfacing an OLS routine from LAPACK
etc. is not too hard from C/C++, and you will find plenty of examples in
the R sources.
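
One way to find out where the time goes is simply to time a small version of
the job in plain R first. The sketch below is only an assumption about what
the task might look like: winsorize() and rolling_beta() are hypothetical
helpers written for this example, the data are simulated, and the window
width and cutoff are arbitrary.

```r
# Winsorize a vector at the p and 1 - p quantiles (hypothetical helper).
winsorize <- function(v, p) {
  q <- quantile(v, probs = c(p, 1 - p), names = FALSE)
  pmin(pmax(v, q[1]), q[2])
}

# Rolling bivariate beta of y on x over windows of `width` observations,
# winsorizing y within each window (hypothetical helper).
rolling_beta <- function(x, y, width, p) {
  starts <- seq_len(length(y) - width + 1)
  vapply(starts, function(s) {
    idx <- s:(s + width - 1)
    yw <- winsorize(y[idx], p)
    cov(x[idx], yw) / var(x[idx])   # bivariate OLS slope
  }, numeric(1))
}

set.seed(1)
x <- rnorm(2000)                      # simulated data, for illustration only
y <- 0.8 * x + rnorm(2000, sd = 0.5)

# Time one pass at one winsorization level, then scale up in your head.
timing <- system.time(betas <- rolling_beta(x, y, width = 250, p = 0.01))
print(timing)
```

If one pass like this is already fast enough once multiplied by the number of
series and winsorization levels, there is little reason to leave R; Rprof()
can then tell you whether the quantile step or the regression step dominates.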

| (I am right now working on getting blas-atlas to compile on my gentoo
| system.  It just died in the compilation over something.)

[ On Debian, it has been only an 'apt-get install' away for almost six
years now. Similarly, Ubuntu has had Atlas-enabled R ever since it started. ]

Hth, Dirk

Three out of two people have difficulties with fractions.
