[R] Improving data processing efficiency

Horace Tso Horace.Tso at pgn.com
Sat Jun 7 01:46:35 CEST 2008

Daniel, allow me to step off the party line here for a moment: in a problem like this it's better to code your function in C and then call it from R. You get a vast performance improvement instantly. (From what I see, recoding in C should be quite straightforward.)


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Daniel Folkinshteyn
Sent: Friday, June 06, 2008 4:35 PM
To: hadley wickham
Cc: r-help at r-project.org; Patrick Burns
Subject: Re: [R] Improving data processing efficiency

> install.packages("profr")
> library(profr)
> p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
> plot(p)
> That should at least help you see where the slow bits are.
> Hadley
so profiling reveals that '[.data.frame' and '[[.data.frame' and '[' are
the biggest timesuckers...

i suppose i'll try using matrices and see how that stacks up (since all
my cols are numeric, should be a problem-free approach).
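
a rough sketch of the matrix idea, assuming made-up stand-in data (the names df, m, id, quarter and assets are hypothetical, since the real columns aren't shown in this thread). it converts once with as.matrix() and then compares repeated scalar access through both paths:

```r
# Hypothetical stand-in data; the real columns are not shown in this thread.
set.seed(1)
df <- data.frame(id      = 1:1000,
                 quarter = sample(1:40, 1000, replace = TRUE),
                 assets  = runif(1000))

# One-time conversion. This only keeps the values numeric because every
# column is numeric; a single character column would coerce the whole matrix.
m <- as.matrix(df)

# Same element, two access paths.
i <- 500
from_df <- df[i, "assets"]   # dispatches to `[.data.frame`
from_m  <- m[i, "assets"]    # plain matrix indexing

# Rough timing of repeated scalar access inside a loop.
n <- 1000
t_df <- system.time(for (k in seq_len(n)) df[k, "assets"])
t_m  <- system.time(for (k in seq_len(n)) m[k, "assets"])
print(rbind(data.frame = t_df, matrix = t_m))
```

the timings will vary by machine and R version, so treat the comparison as indicative only.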

but i'm really wondering if there isn't some neat vectorized approach i
could use to avoid at least one of the nested loops...
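
to make the question concrete, here is a sketch of what dropping the inner loop could look like. the matching rule is an assumption for illustration only (nearest assets within the same quarter), and issuers, nonissuers, match_loop and match_vec are hypothetical names, not the actual function from this thread:

```r
# Hypothetical data; column names are illustrative only.
set.seed(42)
issuers    <- cbind(quarter = sample(1:8, 50,  replace = TRUE), assets = runif(50))
nonissuers <- cbind(quarter = sample(1:8, 500, replace = TRUE), assets = runif(500))

# Nested-loop version: scan every candidate row for every issuer.
match_loop <- function(iss, non) {
  out <- rep(NA_integer_, nrow(iss))
  for (i in seq_len(nrow(iss))) {
    best <- Inf
    for (j in seq_len(nrow(non))) {
      if (non[j, "quarter"] == iss[i, "quarter"]) {
        d <- abs(non[j, "assets"] - iss[i, "assets"])
        if (d < best) {
          best <- d
          out[i] <- j
        }
      }
    }
  }
  out
}

# Inner loop replaced by operations on whole columns.
match_vec <- function(iss, non) {
  out <- rep(NA_integer_, nrow(iss))
  for (i in seq_len(nrow(iss))) {
    # Row indices of candidates in the same quarter.
    cand <- which(non[, "quarter"] == iss[i, "quarter"])
    if (length(cand) > 0) {
      # which.min() returns the first minimum, like the strict `<` above.
      out[i] <- cand[which.min(abs(non[cand, "assets"] - iss[i, "assets"]))]
    }
  }
  out
}

same <- identical(match_loop(issuers, nonissuers), match_vec(issuers, nonissuers))
```

the outer loop is still there; only the per-candidate scan has been turned into vector operations, which is usually where most of the subsetting calls come from.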
