[R] NOT-SO-SIMPLE function!

Marc Schwartz marc_schwartz at comcast.net
Mon Jun 2 21:59:35 CEST 2008


on 06/02/2008 01:30 PM T.D.Rudolph wrote:
> I am trying to set up a function which processes my data according to the
> following rules:
> 
> 1. if (x[i]==0) NA
> 2. if (x[i]>0) log(x[i]/(number of consecutive zeros immediately preceding
> it +1)) 
> 
> The data this will apply to include a variety of whole numbers not limited
> to 1 & 0, a number of which may appear consecutively and not separated by
> zeros.  Below is an example with a detailed explanation of the output
> desired:
> 
> x <- c(3,2,0,1,0,2,0,0,1,0,0,0,0,4,1) 
> output desired = c(1.098, 0.69, NA, -0.69, NA, -0.41, NA, NA, 1.098, NA, NA,
> NA, NA, -0.22, 0) 
> 
> the 1st element, 3, becomes log(3) = 1.098612 
> the 2nd element, 2, becomes log(2) = 0.6931472 
> the 3rd element, 0, becomes NA (cannot log zero). 
> the 4th element, 1, becomes log(1/(1(number of consecutive zeros immediately
> preceding it) + 1 (constant))) = log(1/2) =  -0.6931472 
> the 5th element, 0, becomes NA 
> the 6th element, 2, becomes log(2/(1(number of consecutive zeros immediately
> preceding it) + 1 (constant))) = log(2/3) = -0.4054651 

The above should be log(2 / (1 + 1)) = log(2/2) = 0, since only 1 zero
immediately precedes the 2 in the 6th position.

> the 7th and 8th elements, both zeros, become NA 
> the 9th element, 1, becomes log(1/(2(number of consecutive zeros immediately
> preceding it) + 1 (constant))) = log(1/3) =  1.098612 

The above should be log(1/3) = -1.098612 (negative, not positive)

> the 10-13th elements, all zeros, each become NA 
> the 14th element, 4, becomes log(4/(4(number of consecutive zeros
> immediately preceding it) + 1 (constant))) = log(4/5) = -0.2231436 
> the 15th element, 1, becomes log(1) = 0 
> 
> This one has been in the works for some time and I can't quite seem to crack
> it.
> I would be indebted to anyone who could solve it - it seemed so simple
> at the outset!
> Tyler

I am presuming that the discrepancies noted above are typos/errors in your
per-element explanation of the processing of the vector.  If so, then the
following should work as a first pass and could probably be optimized further:

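# Transform element i of x: NA for a zero, otherwise
# log(x[i] / (number of zeros immediately preceding it + 1)).
zeroes <- function(x, i)
{
   if (x[i] == 0) {
     NA                          # cannot take the log of zero
   } else if (i == 1) {
     log(x[i])                   # first element: nothing precedes it
   } else if (x[i - 1] != 0) {
     log(x[i])                   # no zero immediately before: divisor is 1
   } else {
     # x[i - 1] is zero, so the last run in x[1:(i-1)] is the zero run
     rz <- rle(x[1:(i-1)])
     log(x[i] / (rz$lengths[length(rz$lengths)] + 1))
   }
}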


x <- c(3, 2, 0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 0, 4, 1)


> sapply(seq(along = x), function(i) zeroes(x, i))
 [1]  1.0986123  0.6931472         NA -0.6931472         NA  0.0000000
 [7]         NA         NA -1.0986123         NA         NA         NA
[13]         NA -0.2231436  0.0000000
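
The function above calls rle() again for every index that follows a zero, so one possible optimization is to call rle() once on the whole vector. The following is only a sketch of that idea, not a tested replacement: the name zero_run_log and the helper variables are made up for illustration, and it assumes x contains no NA values.

```r
# Hypothetical vectorized variant: one rle() call for the whole vector.
zero_run_log <- function(x) {
  runs   <- rle(x == 0)                       # alternating zero / non-zero runs
  starts <- cumsum(runs$lengths) - runs$lengths + 1   # first index of each run

  zeros_before <- numeric(length(x))          # default: no preceding zeros
  nz <- which(!runs$values)                   # positions of the non-zero runs
  nz <- nz[nz > 1]                            # those with a (zero) run before them
  # Only the first element of a non-zero run has zeros immediately before it.
  zeros_before[starts[nz]] <- runs$lengths[nz - 1]

  out <- log(x / (zeros_before + 1))          # log(0) gives -Inf here ...
  out[x == 0] <- NA                           # ... which is replaced by NA
  out
}

x <- c(3, 2, 0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 0, 4, 1)
zero_run_log(x)
```

Because the runs of (x == 0) alternate between TRUE and FALSE, any non-zero run other than the first is necessarily preceded by a zero run, which is what the nz > 1 filter relies on.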


See ?rle for more information on the identification of the sequential 
zeroes in the vector.
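
As a small illustration of what rle() returns, here it is applied to the elements preceding the 9th position of x. The variable name rz simply mirrors the one used in the function; the slice 1:8 is chosen only for this example.

```r
x <- c(3, 2, 0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 0, 4, 1)

rz <- rle(x[1:8])     # runs in 3 2 0 1 0 2 0 0
rz$lengths            # 1 1 1 1 1 1 2
rz$values             # 3 2 0 1 0 2 0

# The last run is the pair of zeros immediately before x[9],
# so the divisor for the 9th element is 2 + 1 = 3.
n_zeros <- rz$lengths[length(rz$lengths)]
log(x[9] / (n_zeros + 1))    # log(1/3)
```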

HTH,

Marc Schwartz


