[Rd] serialization regression in 2.15.0 beta

Ben Goodrich bg2382 at columbia.edu
Sat Mar 24 00:13:46 CET 2012


Hi,

I am experiencing a problem related to serialization behavior in  
2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from  
svn) that is not present in 2.14.2 (binary from Debian testing).

I don't fully understand the problem, and I have not yet been able to
create a small, self-contained example that reproduces it. However, I
do have a large example that is not self-contained: it requires an
alpha version of the mi package that is not yet on CRAN (the mi
package on CRAN would not exhibit this issue). Anyone interested in
reproducing the problem can follow the readme.txt file in this
directory:

http://www.columbia.edu/~bg2382/mi/serialization/

I track r-devel with git-svn and was able to git bisect the problem to
svn commit r58219:

commit 799102bd9d0266fe89c3120981decf0b1f17ef11
Author: ripley <ripley at 00db46b3-68df-0310-9c12-caf00c1e9a41>
Date:   Sat Jan 28 15:02:34 2012 +0000

     make use of non-xdr serialization;.

although this commit could merely expose the problem rather than cause it.

The problem occurs when the FUN called by mclapply() in the parallel
package returns an S4 object that contains a slot (called X) holding a
large matrix, specifically a "model matrix" similar to that produced
by glm(). Some columns of this matrix get corrupted with wrong values
(usually zero, but sometimes NaN or on the order of 10^300). This can
be seen by examining X right before FUN returns (to mclapply()'s
environment) and comparing it to the "same" X after mclapply() returns
to the calling environment.
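
A minimal sketch of that before/after comparison, for anyone who wants
to try it without the mi package. The S4 class "Holder" and the
function make_holder() are hypothetical stand-ins, not part of mi, and
this small case is not known to reproduce the corruption; it only
shows the shape of the check.

```r
library(methods)
library(parallel)

# Hypothetical S4 class with a matrix slot named X (a stand-in for the mi object).
setClass("Holder", representation(X = "matrix"))

# Build the object deterministically so serial and forked runs are comparable.
make_holder <- function(i) {
  set.seed(i)
  new("Holder", X = matrix(rnorm(2000 * 50), nrow = 2000, ncol = 50))
}

# mclapply() forks only on Unix-alikes; fall back to one core on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial <- lapply(1:4, make_holder)
forked <- mclapply(1:4, make_holder, mc.cores = cores)

# TRUE for an element means its X slot came back from the child unchanged.
same <- mapply(function(a, b) identical(a@X, b@X), serial, forked)
print(same)
```

Any FALSE in the result would point at the transfer from the child
process back to the master rather than at FUN itself.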

Part of svn commit r58219 is this hunk:

diff --git a/src/library/parallel/R/unix/mcfork.R b/src/library/parallel/R/unix/mcfork.R
index 8e27534..4f92193 100644
--- a/src/library/parallel/R/unix/mcfork.R
+++ b/src/library/parallel/R/unix/mcfork.R
@@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
  ## used by mcparallel, mclapply
  sendMaster <- function(what)
  {
-    if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
+    # This is talking to the same machine, so no point in using xdr.
+    if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
      .Call(C_mc_send_master, what, PACKAGE = "parallel")
  }

Contrary to the comment, I have found that if I specify xdr = TRUE, I
get the expected behavior (a non-corrupted X slot) in 2.16.0, even
though it is forking locally on my 64-bit Debian laptop with a
little-endian i7 processor, whose specs are:

goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping        : 7
microcode       : 0x17
cpu MHz         : 800.000
cache size      : 6144 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge  
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe  
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl  
xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl  
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt  
tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts  
tpr_shadow vnmi flexpriority ept vpid
bogomips        : 3990.83
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

...

processor       : 7
[same as processor 0]

So, to summarize, I get the good behavior on R 2.14.2 when using
mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 when
using mclapply() if and only if I patch xdr = TRUE into sendMaster().
I get the bad behavior on 2.15.0 beta and unpatched 2.16.0 when using
mclapply().
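
Since the only difference in the patched case is the xdr argument, one
way to look at serialize() in isolation, without any forking, is a
plain round trip with both settings. This is only a sketch: the matrix
below is made-up data, not the model matrix from mi, and a clean
result here would not rule out a problem that depends on the object's
size or on the transfer to the master process.

```r
# Made-up stand-in for a model matrix, including the kinds of values seen
# in the corrupted columns (NaN and a value near 10^300).
X <- matrix(rnorm(1000 * 20), nrow = 1000, ncol = 20)
X[1, 1] <- NaN
X[2, 1] <- 1e300

raw_xdr    <- serialize(X, NULL, xdr = TRUE)   # big-endian XDR format
raw_native <- serialize(X, NULL, xdr = FALSE)  # native byte order

# Round-trip each serialized form and compare against the original.
ok_xdr    <- identical(unserialize(raw_xdr), X)
ok_native <- identical(unserialize(raw_native), X)
print(c(xdr = ok_xdr, native = ok_native))
```

If the native round trip fails while the xdr one succeeds, that would
suggest the non-xdr code path in serialize()/unserialize() itself; if
both succeed, the difference more likely lies elsewhere in the path
between sendMaster() and the master process.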

My session info:

> sessionInfo()
R version 2.15.0 beta (2012-03-16 r58769)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
  [1] mi_0.9-83        bigmemory_4.2.11 arm_1.5-03       foreign_0.8-49
  [5] abind_1.4-0      R2WinBUGS_2.1-18 coda_0.14-5      lme4_0.999375-42
  [9] Matrix_1.0-4     lattice_0.20-0   MASS_7.3-17

loaded via a namespace (and not attached):
[1] grid_2.15.0  nlme_3.1-103

Thanks,
Ben


