[R] Performance note: Preallocating helps? and two questions

Douglas Bates bates at stat.wisc.edu
Wed Nov 1 17:35:00 CET 2000

Angelo Canty <canty at discrete.Concordia.CA> writes:

> I'm not sure why the first method works so badly although it is
> clearly not something you should do.  You are trying to use indices
> of 1:35833 on a vector of length 1.  With the second method the vector
> that you are using is at least of the correct length.  On my Sun Sparc
> the first method takes 3 minutes and the second takes 9 seconds.
> Interestingly on S-plus 3.4 for Unix, the two methods take about the
> same time (12 seconds each).  My personal feeling is that the first
> method should result in an error message (or at least a warning).

The semantics of the S language allow you to assign an element to a
vector even if the index did not exist previously.  You can use this
to extend vectors.
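
A minimal sketch of that behaviour, using a made-up vector x (not
taken from the original poster's code):

```r
# Hypothetical vector, for illustration only
x <- 1                # a vector of length 1
x[3] <- 5             # index 3 does not exist yet, so x is extended
length(x)             # 3
is.na(x[2])           # TRUE: the skipped position is filled with NA
```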

This is implemented by allocating a new, longer vector, copying the
contents of the previous vector into its leading part, and then
installing the new element(s).

That is why the first version of this code is so slow.  Each pass
through the loop extends the vector by one element, so the initial
parts of the vector are being copied tens of thousands of times.  In
the second version the vector already has its full length, and the new
elements can be installed without having to copy the previous contents.
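
The original loops are not reproduced in this message, so the
following is only a sketch of the two approaches as I understand
them.  The data in AU are simulated stand-ins, and the names grow and
prealloc are mine.  Timings will depend on the machine and the
version of R, so none are quoted here.

```r
# Assumed stand-in data; the real AU is not shown in this thread
set.seed(1)
n  <- 35833
AU <- runif(n, 0, 60)

# First method: start with a length-1 vector and let each
# assignment extend it by one element
grow <- 0
for (i in 1:n) grow[i] <- min(AU[i], 30)

# Second method: allocate the full length once, then fill in place
prealloc <- numeric(n)
for (i in 1:n) prealloc[i] <- min(AU[i], 30)

# Both loops produce the same values; only the allocation differs
identical(grow, prealloc)
```

Wrapping each loop in system.time() is one way to compare the two
on your own installation.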

> In any case there is no need for a loop here since
> AUcap30 <- pmin(AU, 30)
> does what you want in about half a second.

Indeed.  It helps to look for these functions.

I heartily recommend reading section 8.7 of Venables and Ripley's "S
Programming" where they give a checklist for S programmers.  The first
item on their checklist is to decide if you really should be using
pmin or pmax where you have min or max.
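
The difference is easy to see on a small made-up vector (the values
below are illustrative only):

```r
AU <- c(12, 45, 30.5, 7)   # made-up values
min(AU, 30)                # 7: one minimum over all the arguments
pmin(AU, 30)               # 12 30 30 7: element-wise, with 30 recycled
```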