[R] A speed improvement challenge

Frank E Harrell Jr fharrell at virginia.edu
Thu Oct 25 19:01:20 CEST 2001

I need to create a vector of probability-weighted randomly
sampled values which each satisfy a different criterion.
This causes the sampling weights to vary between elements
of the vector.  Here is some example code, with x, y,
and freq being vectors each of length n.  aty is a
vector of length m.

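  yinv <- double(m)
  for(i in seq_len(m)) {
    s <- which(abs(y - aty[i]) < del)   # tabulated points within del of target
    # Sample an index, not x[s] itself: sample(x[s], 1) treats a single
    # number x[s] >= 1 as 1:x[s].  prob need not be normalized.
    yinv[i] <- if(length(s))
     x[s[sample.int(length(s), 1, prob=freq[s])]] else
     approx(y, x, xout=aty[i], rule=2)$y
  }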

Big picture: For a tabulated function given by (x,y), with the
frequency of each occurrence of (x,y) given by the corresponding
element in freq, find the inverse of the function (x as a function
of y) for a vector of desired y-values aty[1],...,aty[m].  If
the function has a nearly flat segment, let the resulting yinv[i]
be an x sampled at random, with weights freq, from those x for
which f(x) is within del of the target y-value aty[i].  If no
tabulated y is within del of the target, use reverse linear
interpolation to get the inverse.  The reverse linear interpolation
can easily be vectorized (approx(y, x, xout=aty, rule=2)$y); the
sampling step is what needs the loop.
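
One possible way to vectorize the sampling step as well, offered only
as a sketch: sort by y, take the cumulative sum of freq, locate each
target's window with findInterval(), and draw one uniform number per
target inside that window's share of the cumulative weights (an
inverse-CDF draw). The function name inverse_sample is hypothetical and
not part of the original post. The sketch assumes every freq is positive,
that y has no exact ties, and that findInterval() supports left.open
(R 3.3.0 or later). Window membership is decided by comparing y with
aty - del and aty + del, so a point lying exactly on the boundary could
be classified differently from abs(y - aty) < del because of rounding.

```r
# Hypothetical helper (not from the original post): vectorized inverse lookup.
# Assumes every freq > 0 and that y has no exact ties.
inverse_sample <- function(x, y, freq, aty, del) {
  ord <- order(y)
  ys <- y[ord]; xs <- x[ord]
  cf  <- cumsum(freq[ord])       # running total of the weights
  cf0 <- c(0, cf)                # cf0[k + 1] = total weight of the first k points

  # Window of sorted points with aty - del < y < aty + del
  lo  <- findInterval(aty - del, ys)                    # count of y <= aty - del
  hi  <- findInterval(aty + del, ys, left.open = TRUE)  # count of y <  aty + del
  has <- hi > lo                                        # window is non-empty

  # Default for every target: reverse linear interpolation
  out <- approx(y, x, xout = aty, rule = 2)$y

  if (any(has)) {
    l <- lo[has]; h <- hi[has]
    # Inverse-CDF draw: a uniform position inside the window's weight range
    u   <- cf0[l + 1] + runif(length(l)) * (cf0[h + 1] - cf0[l + 1])
    idx <- findInterval(u, cf) + 1
    idx <- pmax(pmin(idx, h), l + 1)   # guard against floating-point rounding
    out[has] <- xs[idx]
  }
  out
}

# Small made-up example: a nearly flat segment around y = 1
x    <- c(1, 2, 3, 4, 5)
y    <- c(0, 1, 1.01, 1.02, 3)
freq <- rep(1, 5)
set.seed(1)
res <- inverse_sample(x, y, freq, aty = c(1.01, 2, 0), del = 0.05)
```

Here the first target falls on the flat segment, so res[1] is one of
2, 3, 4; the second has no tabulated y within del and is interpolated;
the third has a single point in its window. The draws follow the same
weighted distribution as the loop but use a different random stream, so
results will not match the loop value for value under the same seed.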

Thanks for any ideas.
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
