[Rd] OOP performance, was: V2.9.0 changes [SEC=Unclassified]

Troy Robertson Troy.Robertson at aad.gov.au
Fri Jul 3 01:39:51 CEST 2009

Hi Thomas,

It is a population-based model, but I didn't develop the work.  I am just the programmer who has been given the job of coding it.  The goal is a plug-and-play approach that lets users construct the model themselves (both its elements and its functionality).  Hence my focus on OO.

You are right about avoiding the copying of large objects.  That is what was killing things.  I am now working on vectorizing more of the number crunching and removing some of the nested for loops.  That should step things up a little too.
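
As a minimal sketch of the kind of change I mean (the matrix layout, the survival rates and the function names step_loop and step_vec are all made up for illustration, not taken from the actual model), a nested loop over age classes and patches can be replaced by a single vectorized multiplication:

```r
# Hypothetical data: abundance by age class (rows) and patch (columns).
set.seed(1)
n_age <- 50
n_patch <- 200
abundance <- matrix(runif(n_age * n_patch, 0, 100), nrow = n_age)
survival <- runif(n_age, 0.5, 0.95)  # hypothetical per-age survival rates

# Nested-loop version: one scalar multiplication per cell.
step_loop <- function(abundance, survival) {
  out <- matrix(0, nrow(abundance), ncol(abundance))
  for (i in seq_len(nrow(abundance))) {
    for (j in seq_len(ncol(abundance))) {
      out[i, j] <- abundance[i, j] * survival[i]
    }
  }
  out
}

# Vectorized version: R stores matrices column by column, so the
# survival vector (one value per row) is recycled down each column.
step_vec <- function(abundance, survival) abundance * survival

loop_result <- step_loop(abundance, survival)
vec_result <- step_vec(abundance, survival)
```

Both functions should return the same matrix; wrapping each call in system.time() gives a rough idea of the difference on a given machine.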

I do also need to investigate how to move some of the more expensive code to C.

Had a quick look at simecol which looks interesting.  Will point it out to my boss to check out too.



> -----Original Message-----
> From: Thomas Petzoldt [mailto:Thomas.Petzoldt at tu-dresden.de]
> Sent: Friday, 3 July 2009 1:31 AM
> To: Troy Robertson
> Cc: 'r-devel at R-project.org'
> Subject: OOP performance, was: [Rd] V2.9.0 changes [SEC=Unclassified]
> Hi Troy,
> first of all, a question: what kind of ecosystem models are you
> developing in R? Differential equations or individual-based?
> You write that you are a frustrated Java developer in R. I have had a
> similar experience; however, I still like Java, and I'm now happier
> with R as it is much more efficient (i.e. sum(programming + runtime))
> for the things I usually do: ecological data analysis and modelling.
> After using functional R for quite some time, and Java in parallel,
> I had the same idea: to make R more Java-like and to model ecosystems in
> an object-oriented manner. At that time I took a look at R.oo (thanks,
> Henrik Bengtsson) and was one of the co-authors of proto. I still think
> that R.oo is very good and that proto is a cool idea, but in the end I
> switched to the recommended S4 for my ecological simulation package.
> Note also that my solution was *not* to model the ecosystems as objects
> (habitat - populations - individuals), but instead to model ecological
> models (equations, inputs, parameters, time steps, outputs, ...).
> This works quite well with S4. A speed test (see useR!2006 poster on
> http://simecol.r-forge.r-project.org/) showed that all OOP flavours had
> quite comparable performance.
> The only things I have to keep in mind are a few rules:
> - avoid unnecessary copying of large objects. Sometimes it helps to
> prefer matrices over data frames.
> - use vectorization. For an individual-based model this means that one
> has to rethink how to model an individual: not "many [S4] objects"
> as in Java, but R structures (arrays, lists, data frames) that
> vectorized functions (e.g. arithmetic or subset) can operate on.
> - avoid interpolation (i.e. approx) and if unavoidable, minimize the
> tables.
> If all these things do not help, I write core functions in C (others use
> Fortran). This can be done in a mixed style and even a full C to C
> communication is possible (see the deSolve documentation for how to do this
> with differential equation models).
> Thomas P.
> --
> Thomas Petzoldt
> Technische Universitaet Dresden
> Institut fuer Hydrobiologie        thomas.petzoldt at tu-dresden.de
> 01062 Dresden                      http://tu-dresden.de/hydrobiologie/


    Australian Antarctic Division - Commonwealth of Australia
        Visit our web site at http://www.antarctica.gov.au/
