[Rd] Using log() on an openMosix cluster
Torsten Hothorn
Torsten.Hothorn at rzmail.uni-erlangen.de
Mon Nov 24 09:09:19 MET 2003
On Fri, 21 Nov 2003, Roger D. Peng wrote:
> Hi all, I was hoping to get some advice about a problem that I realize
> will be difficult to reproduce for some people. I'm running R 1.7.1 on
> an openMosix (Linux) cluster and have been experiencing some odd
> slow-downs. If anyone has experience with such a setup (or a similar
> one) I'd appreciate any help. Here's a simplified version of the problem.
>
> I'm trying to run the following code:
> ##
> N <- 100000; a <- numeric(N); b <- numeric(N)
> e <- rnorm(N)
>
> for(i in 1:N) {
> a[i] <- exp(e[i])
> b[i] <- log(abs(a[i]))
> }
> ##
>
> When I run it on the head node, everything is fine. However, when I
> send the R process off to one of the cluster nodes (i.e. using mosrun
> from the head node) the program takes about 10 times longer (in
> wall-clock time, cpu time is roughly the same).
>
Did you adapt the sig*jmp definitions in src/include/Defn.h? This was
necessary up to and including R-1.7.1 and is no longer needed, thanks to
Luke's changes in 1.8.0:

    o   On Unix-like systems interrupt signals now set a flag that is
        checked periodically rather than calling longjmp from the
        signal handler.  This is analogous to the behavior on Windows.
        This reduces responsiveness to interrupts but prevents bugs
        caused by interrupting computations in a way that leaves the
        system in an inconsistent state.  It also reduces the number
        of system calls, which can speed up computations on some
        platforms and make R more usable with systems like Mosix.

I tried the example above with N = 1,000,000:

  N <- 1000000; a <- numeric(N); b <- numeric(N)
  e <- rnorm(N)
  for(i in 1:N) {
      a[i] <- exp(e[i])
      b[i] <- log(abs(a[i]))
  }
  cat(proc.time())

using R-1.8.0 on Linux 2.4.22 with the openMosix patch, and started 10
processes, which migrated immediately.
hothorn at mosix:~/tmp/log$ grep -1 cat *.Rout
log1.Rout-R>
log1.Rout:R> cat(proc.time())
log1.Rout-37.04 1.02 43.44 0 0.01R>
--
log10.Rout-R>
log10.Rout:R> cat(proc.time())
log10.Rout-34.25 0.45 40.21 0 0.01R>
--
log2.Rout-R>
log2.Rout:R> cat(proc.time())
log2.Rout-22.19 0.33 29.36 0 0R>
--
log3.Rout-R>
log3.Rout:R> cat(proc.time())
log3.Rout-24.46 0.42 32.96 0 0.03R>
--
log4.Rout-R>
log4.Rout:R> cat(proc.time())
log4.Rout-36.88 0.38 40.73 0 0.02R>
--
log5.Rout-R>
log5.Rout:R> cat(proc.time())
log5.Rout-34.79 0.52 42.83 0.02 0R>
--
log6.Rout-R>
log6.Rout:R> cat(proc.time())
log6.Rout-34.14 0.54 41.46 0 0.01R>
--
log7.Rout-R>
log7.Rout:R> cat(proc.time())
log7.Rout-35.21 0.66 43.4 0 0R>
--
log8.Rout-R>
log8.Rout:R> cat(proc.time())
log8.Rout-25.27 0.55 33.77 0 0.01R>
--
log9.Rout-R>
log9.Rout:R> cat(proc.time())
log9.Rout-36.69 0.44 43.16 0.01 0R>
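
As a rough check, the figures above can be summarised by comparing elapsed
time with CPU time for each process. The sketch below assumes the first
three values printed by proc.time() are user CPU, system CPU and elapsed
seconds; the data frame and its column names are only illustrative.

```r
# proc.time() values copied from the ten .Rout files listed above:
# user CPU, system CPU and elapsed (wall-clock) seconds per process.
timings <- data.frame(
  run     = c(1, 10, 2, 3, 4, 5, 6, 7, 8, 9),
  user    = c(37.04, 34.25, 22.19, 24.46, 36.88, 34.79, 34.14, 35.21, 25.27, 36.69),
  system  = c(1.02, 0.45, 0.33, 0.42, 0.38, 0.52, 0.54, 0.66, 0.55, 0.44),
  elapsed = c(43.44, 40.21, 29.36, 32.96, 40.73, 42.83, 41.46, 43.40, 33.77, 43.16)
)

# Ratio of wall-clock time to total CPU time; a value near 1 means the
# process spent little time waiting (e.g. on remote system calls).
timings$ratio <- timings$elapsed / (timings$user + timings$system)

print(round(timings, 2))
print(range(timings$ratio))
```

For all ten migrated processes the ratio stays well below 1.5, nowhere near
the factor of 10 reported in the original message.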
So, everything is fine here. I guess using R-1.8.1 will fix your
problem.
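
To repeat the comparison on the head node and on a cluster node, something
like the following could be used. It is only a sketch: time_log_loop() is a
hypothetical helper, not part of R, and it simply wraps the loop from the
original message in system.time() so that CPU and wall-clock time are
reported side by side.

```r
# Hypothetical helper: run the exp()/log() loop and return CPU time
# (user + system) and elapsed (wall-clock) time in seconds.
time_log_loop <- function(N = 100000) {
  a <- numeric(N); b <- numeric(N)
  e <- rnorm(N)
  timing <- system.time(
    for (i in 1:N) {
      a[i] <- exp(e[i])
      b[i] <- log(abs(a[i]))
    }
  )
  cpu     <- timing[[1]] + timing[[2]]  # user + system CPU seconds
  elapsed <- timing[[3]]                # wall-clock seconds
  c(cpu = cpu, elapsed = elapsed)
}

res <- time_log_loop(10000)
print(res)
```

If elapsed time is many times larger than CPU time on a migrated process but
not on the head node, that points to time spent outside the computation
itself, such as system calls being forwarded to the home node.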
Torsten
> Interestingly, when I tried running the following code:
> ##
> N <- 100000; a <- numeric(N); b <- numeric(N)
> e <- rnorm(N)
>
> for(i in 1:N) {
> a[i] <- exp(e[i])
> b[i] <- exp(abs(a[i]))
> }
> ##
>
> I didn't experience any slow-down! That is, the wall-clock time is the
> same when run on the head node or on the cluster nodes. The only
> difference between the two programs is that one takes a log in the for()
> loop and the other one takes an exponential.
>
> I guess my question is why would taking the log() produce a 10-fold
> increase in runtime? I know that Mosix clusters can experience serious
> performance hits if you make a lot of system calls or write out data to
> files but I don't think I'm doing that here. Is there some major
> difference in the way that exp() and log() are implemented?
>
> I'm pretty sure this isn't an R problem but I'm wondering if R is doing
> something behind the scenes that's affecting performance in the
> openMosix setting.
>
> Thanks in advance for any help.
>
> -roger