[R] Confused - better empirical results with error in data

Mark Knecht markknecht at gmail.com
Mon Sep 7 23:34:40 CEST 2009


On Mon, Sep 7, 2009 at 1:22 PM, Noah Silverman<noah at smartmediacorp.com> wrote:
<SNIP>
>
> The data is listed in our CSV file from newest to oldest.  We are supposed
> to calculate a value that is an "average" of some items.  We loop through
> some queries to our database and increment two variables - $total_found and
> $total_score.  The final value is simply $total_score / $total_found.
>
<SNIP>

This does seem rife with possibilities for non-causal behavior, that
is, a value computed for one date that includes data from later dates.
(Assuming you process from newest toward oldest, which is what I think
you say you are doing...) I'm pretty sure that if I knew the Dow was
going to be higher 3 months from now, my day trading would lean long
rather than short and I'd do better. Unfortunately I don't know where
it will be, so I cannot really do that.

Have you considered processing the data in the other direction? Not in
R, but rather by reversing the data frame or, better yet, by writing
the CSV file in date order?
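
If you did want to try it in R after all, here is a rough sketch of the idea, not your actual processing. The data frame `dat` and its columns `date` and `score` are made up for illustration; I am assuming your CSV has something like a date column and a per-row score. It sorts oldest first and then builds the running total_score / total_found from earlier rows only, so no row sees data from its own future.

```r
# Hypothetical data, newest first, standing in for the CSV described above.
dat <- data.frame(
  date  = as.Date(c("2009-09-04", "2009-09-03", "2009-09-02", "2009-09-01")),
  score = c(4, 3, 2, 1)
)

# Put the rows in date order (oldest first).
dat <- dat[order(dat$date), ]

# Running average that uses only rows strictly before the current one.
n <- nrow(dat)
prior_total <- c(0, cumsum(dat$score)[-n])  # sum of earlier scores
prior_found <- seq_len(n) - 1               # number of earlier rows
dat$avg_prior <- ifelse(prior_found > 0, prior_total / prior_found, NA)

print(dat)
```

The oldest row gets NA because there is nothing before it to average. If the results get worse once the average is built this way, that would point to the original ordering leaking future information.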

Cheers,
Mark



