[R] Is R the right choice for simulating first passage times of random walks?

Paul Menzel paulepanter at users.sourceforge.net
Mon Aug 1 01:36:31 CEST 2011


Am Mittwoch, den 27.07.2011, 19:59 -0400 schrieb R. Michael Weylandt :
> Some more skilled folks can help with the curve fitting, but the general
> answer is yes -- R will handle this quite ably.

Great to read that.

> Consider the following code:
> 
> <<-------------------------------------->>
> n = 1e5
> length = 1e5
> 
> R = matrix(sample(c(-1,1),length*n,replace=T),nrow=n)
> R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and will make
> your life INFINITELY better
> R = cbind(rep(0,n),R) ## Now each row is a random walk as you desired.
> 
> <<---------------------------------------->>
> 
> There are actually even faster ways to do what you are asking for, but this
> introduces you to some useful R architecture, above all the apply function.

Thank you very much. I realized the the 0 column is not need when
summing this up. Additionally I posted the wrong example code and I
actually am only interested how long it stays negative from the
beginning.

> To see how long the longest stretch of negatives in each row is, the
> following is a little sneaky but works pretty well:
> 
> countNegative = apply(R,1,function(x){which.max(table(cumsum(x>=0))})
> 
> then you can study these random numbers to do whatever with them.
>
> The gist of this code is that it counts how many positive number have been
> seen up to each point: for any stretch this doesn't increase, you must be
> negative, so this identifies the longest such stretch on each row and
> records the length. (It may be off by one so check it on a smaller R matrix.

That is a great example. It took me a while what `table()` does here but
thanks to your explanation I finally understood it.

[…]

> So all together:
> 
> <<-------------------------------------->>
> n = 1e3
> length = 1e3
> 
> R = matrix(sample(c(-1,1),length*n,replace=T),nrow=n)
> R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and will make
> your life INFINITELY better
> R = cbind(rep(0,n),R) ## Now each row is a random walk as you desired.
> fTemp <- function(x) {
>     return(max(table(cumsum(x>=0))))
> }
> countNegative = apply(R,1,fTemp)
> mu = mean(countNegative)
> sig = sd(countNegative)/sqrt(length(countNegative))
> 
> <<---------------------------------------->>
> 
> This runs pretty fast on my laptop, but you'll need to look into the
> memory.limit() function if you want to increase the simulation parameters.
> There are much faster ways to handle the simulation as well, but this should
> get you off to a nice start with R.
> 
> Hope this helps,

It did. Thank you again for the kind and elaborate introduction.

Trying to run your example right away froze my system using `n = 1000`
and `length = 1e5` [1]. So I really need to be careful how big such a
matrix can get. One thing is to use integers as suggested in [2].

My current code looks like the following.

-------- 8< -------- code -------- >8 --------
f4 <- function(n = 100000, # number of simulations
               length = 100000) # length of iterated sum
{
	R = matrix(sample(c(-1L,1L),length*n,replace=T),nrow=n)
	R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and will make your life INFINITELY better
	fTemp <- function(x) {
		if (x[1] >= 0 ) {
			return(1)
		}

		for (i in 1:length-1) {
			if (x[i] < 0 && x[i + 1] >= 0) {
				return(as.integer(i/2 + 2)) # simple random walks only hit 0 on even “times”
			}
		}
	}
	countNegative = apply(R,2,fTemp)
	tabulate(as.vector(countNegative), length)
}
-------- 8< -------- code -------- >8 --------

1.I could actually avoid `cumsum()` half the time, when the first entry
is already positive. So I am still looking for a way to speed that up in
comparison to a simple two loops scenario.
2. The counting of the length how long the walk stayed negative is
probably also inefficient and I should find a better way on how to
return the values.

I am still thinking about both cases, but to come up with
vectoriazations of the problem is quite hard.

So I welcome any suggestions. ;-)


Thanks,

Paul


[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=635832
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=635832#10
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110801/5108acad/attachment.bin>


More information about the R-help mailing list