[R] coding logic and syntax in R

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Dec 24 09:09:18 CET 2003


On Wed, 24 Dec 2003, Pravin wrote:

> I am a beginner in R programming and recently heard about this mailing list.
> Currently, I am trapped into a simple problem for which I just can't find a
> solution. I have a huge dataset (~81,000 observations) that has been

BTW, that is quite a small dataset these days: not even 10 million is `huge'.

> analyzed and the final result is in the form of 0 and 1(one column). 
> 
> I need to write a code to process this column in a little complicated way. 
> These 81,000 observations are actually 9,000 sets (81,000/9). 
> So, in each set whenever zero appears, rest all observations become zero.
> 
> For example;
> 
> If the column has: 
> 
> 111110111111011111111111111111111....
> 
> The output should look like: 
> 
> 111110000111000000111111111111111...

Let me see if I understand you.  This was really

111110111
111011111
111111111
111111...

and you want

111110000
111000000
111111111
111111...

So let's treat it as a matrix (extending to 4 complete sets):

x <- as.numeric(strsplit("111110111111011111111111111111111011", NULL)[[1]])
xx <- matrix(x, ncol=9, byrow=TRUE)

Then a simple loop

for(i in 2:9) xx[,i] <- xx[,i] & xx[,i-1]

gives me the second matrix, which I can read out as a vector as

as.vector(t(xx))
[1] 1 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0

or in what I understand as your format

paste(t(xx), collapse="")
[1] "111110000111000000111111111111111000"
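
As an aside, the same "zero everything after the first zero in a set" rule can be sketched without an explicit loop, using a cumulative product along each row. This is only an illustrative alternative under the same assumption as above (sets of 9 stored row by row); the names yy and res are made up for this sketch and do not appear in the original reply.

```r
# Rebuild the example matrix from the post (4 sets of 9 observations).
x <- as.numeric(strsplit("111110111111011111111111111111111011", NULL)[[1]])
xx <- matrix(x, ncol = 9, byrow = TRUE)

# Cumulative product along each row: once a 0 appears, every later
# entry in that row is 0 as well.
# apply() over rows returns each row's result as a column, hence the t().
yy <- t(apply(xx, 1, cumprod))

# Read the matrix back out row by row, in the same format as above.
res <- paste(t(yy), collapse = "")
print(res)
```

The column loop touches only 8 columns regardless of the number of sets, so it is likely to be at least as fast as the apply() version on long inputs; the cumprod() form is shown mainly because it states the rule directly.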

Doing this with 81000 random 0/1's took a fraction of a second.
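
A minimal sketch of how such a timing check might be reproduced is given here. The seed, the use of sample() to generate the data, and the names elapsed and out are assumptions made for illustration; the original message does not say how the random 0/1's were generated, and the timing will vary by machine.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable

# 81000 random 0/1 values arranged as 9000 sets of 9, one set per row.
x <- sample(0:1, 81000, replace = TRUE)
xx <- matrix(x, ncol = 9, byrow = TRUE)

# Time the column-wise loop from the message.
elapsed <- system.time(
  for (i in 2:9) xx[, i] <- xx[, i] & xx[, i - 1]
)
print(elapsed)

# Processed column, read back out set by set.
out <- as.vector(t(xx))
```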


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



