[Rd] Inconsistent results from .C()

Thu May 23 02:18:54 CEST 2013

Had you tried using the new-to-3.0.0 options(CBoundCheck=TRUE)?

    [from the NEWS file]
    There is a new option, options(CBoundsCheck=), which controls how .C()
    and .Fortran() pass arguments to compiled code. If true (which can be
    enabled by setting the environment variable R_C_BOUNDS_CHECK to yes),
    raw, integer, double and complex arguments are always copied, and checked
    for writing off either end of the array on return from the compiled code
    (when a second copy is made). This also checks individual elements of character
    vectors passed to .C().

valgrind helps find memory misuse also.

When you write on memory that you didn't allocate, expect to get mystifying
behavior.  Nonrepeatable results are a sign of memory misuse.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf
> Of Robin Evans
> Sent: Wednesday, May 22, 2013 4:16 PM
> To: R Devel List
> Subject: Re: [Rd] Inconsistent results from .C()
> 
> Update: I did eventually discover an error in the C code which is probably
> ultimately responsible for the bug (the variable eff_size is allowed to get
> too large, and overrun the end of lev2[]).  However the behaviour of R (or
> Rstudio) in response to this is still somewhat mystifying to me!
> 
> Robin
> 
> On 21 May 2013 14:10, Robin Evans <rje42 at cam.ac.uk> wrote:
> 
> > Sure!  C code is below if it helps.  The gist is that the function
> > oneMargin forms two matrices M and C, mostly by repeatedly taking
> > Kronecker products.
> >
> > Robin
> >
> > void kronecker (int *A, int *B, int *dima, int *dimb, int *C) {
> >   int k = 0;
> >   int i1,i2,j1,j2;
> >
> >     for (i2 = 0; i2 < dima[1]; i2++) {
> >     for (j2 = 0; j2 < dimb[1]; j2++) {
> >     for (i1 = 0; i1 < dima[0]; i1++) {
> >     for (j1 = 0; j1 < dimb[0]; j1++) {
> >       C[k] = A[i1 + dima[0]*i2]*B[j1 + dimb[0]*j2];
> >       k++;
> >     }}}}
> > }
> >
> > void iterate (int *x, int *lev) {
> >   bool ok = FALSE;
> >   int i=0;
> >
> >   do {
> >     if(x[i] < lev[i]-1) {
> >       x[i]++;
> >       ok = TRUE;
> >     }
> >     else {
> >       x[i] = 0;
> >     }
> >     i++;
> >   }
> >   while (!ok);
> > }
> >
> > void oneMargin(int *Mar, int *Eff, int *neff, int *lev, int *nvar, int
> > *M, int *C) {
> >
> >   int *M2, *mult, *Cj, *Cj2;
> >   int i,j=1,k,l,m;
> >
> >   /* determine size of M to output */
> >   for (i = 0; i < nvar[0]; i++) {
> >     j *= (Mar[i] == 1 ? lev[i]*lev[i] : lev[i]);
> >   }
> >   M2 = malloc(sizeof(int)*j);
> >   mult = malloc(sizeof(int)*j);
> >
> >   M[0] = 1;
> >   int M_size[2] = {1,1}, mult_size[2], Cj_size[2], C_size[2] = {0,0};
> >
> >   for (i = nvar[0]-1; i >= 0; i--) {
> >     if (Mar[i] == 1) {
> >       for (j = 0; j < lev[i]*lev[i]; j++) {
> >         mult[j] =  ((j % (lev[i]+1)) == 0) ? 1 : 0;
> >       }
> >       mult_size[0] = mult_size[1] = lev[i];
> >     }
> >     else {
> >       for (j = 0; j < lev[i]; j++) {
> >         mult[j] =  1;
> >       }
> >       mult_size[0] = 1;
> >       mult_size[1] = lev[i];
> >     }
> >
> >     kronecker(M, mult, M_size, mult_size, M2);
> >     M_size[0] *= mult_size[0];
> >     M_size[1] *= mult_size[1];
> >
> >     /* copy M2 over to M */
> >     for (j=0; j < M_size[0]*M_size[1]; j++) {
> >       M[j] = M2[j];
> >     }
> >   }
> >
> >   free(M2);
> >
> >   /* Create C matrix */
> >   int index[nvar[0]];
> >   int prod;
> >   int eff_size = 0;
> >   /*swapped = 0; */
> >   mult_size[0] = 1;
> >
> >   C_size[0] = 1;
> >   for (j = 0; j < nvar[0]; j++) {
> >     C_size[0] *= (Mar[j] == 1 ? lev[j] : 1);
> >   }
> >
> >   Cj = malloc(sizeof(int)*C_size[0]);
> >   Cj2 = malloc(sizeof(int)*C_size[0]);
> >   int *lev2;
> >   lev2 = malloc(sizeof(int)*nvar[0]);
> >
> >   /* loop over effects */
> >   for (i = 0; i < neff[0]; i++) {
> >     prod = 1;
> >     for (j = 0; j < nvar[0]; j++) {
> >       if (Eff[j + i*nvar[0]]) {
> >         prod *= lev[j]-1;
> >         lev2[eff_size] = lev[j]-1;
> >         index[eff_size] = 0;
> >         eff_size++;
> >        }
> >     }
> >
> >     /* loop over states of effect (excluding corner points) */
> >     for (j = 0; j < prod; j++) {
> >       Cj[0] = 1;
> >       Cj_size[0] = Cj_size[1] = 1;
> >
> >       k = eff_size;
> >       /* loop over variables */
> >       for (l = nvar[0]-1; l >= 0; l--) {
> >         /* skip variables not in margin */
> >         if (Mar[l] == 0) continue;
> >         mult_size[1] = lev[l];
> >
> >         /* kronecker factor depends on whether or not variable is in
> > effect */
> >         if (Eff[i*nvar[0]+l] == 0) {
> >           /* for variables not in effect, just repeat matrix */
> >           for (m = 0; m < lev[l]; m++) {
> >             mult[m] = 1;
> >           }
> >         }
> >         else {
> >           /* otherwise multiply based on state */
> >           k--;
> >           for (m = 0; m < lev[l]; m++) {
> >             mult[m] = (m == index[k]) ? lev[l] - 1 : -1;
> >           }
> >         }
> >         kronecker(Cj, mult, Cj_size, mult_size, Cj2);
> >
> >         Cj_size[1] *= mult_size[1];
> >
> >         /* copy Cj2 over to Cj */
> >         for (k=0; k < Cj_size[0]*Cj_size[1]; k++) {
> >           Cj[k] = Cj2[k];
> >         }
> >         if (Cj_size[0]*Cj_size[1] > C_size[0]) Rprintf("pointer screwup");
> >       }
> >
> >       /* copy row over to C */
> >       for (m = 0; m < C_size[0]; m++) C[C_size[0]*C_size[1]+m] = Cj[m];
> >       C_size[1] += 1;
> >       iterate(index, lev2);
> >     }
> >   }
> >
> >   free(lev2);
> >   free(Cj);
> >   free(Cj2);
> >   free(mult);
> > }
> >
> >
> > On 21 May 2013 14:04, R. Michael Weylandt <michael.weylandt at gmail.com>
> > wrote:
> > >
> > >  It might also help if you can point us to the C code to help debug.
> > >
> > > MW
> > >
> > > On Tue, May 21, 2013 at 10:53 AM, Robin Evans <rje42 at cam.ac.uk> wrote:
> > > > I should add to this that I'm running on Scientific Linux 6.  I later
> > > > noticed that the bug only seems to occur when I run the code from
> > Rstudio,
> > > > and not if I use the terminal directly, so this may be the key to the
> > > > problem.
> > > >
> > > > Robin
> > > >
> > > > On 20 May 2013 16:12, Robin Evans <rje42 at cam.ac.uk> wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> I've run into a problem which is both maddening and rather hard to
> > > >> replicate, therefore I wondered if someone might know of a plausible
> > > >> explanation for it.  I couldn't find anything in the archives, though
> > > >> maybe I'm searching for the wrong thing.
> > > >>
> > > >> I'm calling some C code using .C, and get the vector I'm interested in
> > > >> back as the 7th location in the returned list.  However I find that if
> > > >> I try to inspect the contents of this entry in the list in some ways,
> > > >> I get one answer, and if I look at it in others I get a different
> > > >> answer.  It's quite possible that there's something wrong with the C
> > > >> code, but this doesn't seem to explain why this phenomenon would occur
> > > >> in R.
> > > >>
> > > >> The problem does not always occur - I have to run the code a few times
> > > >> and then call the console when it does, but the commands below show
> > > >> what can happen when it does.  I apologise for not being able to get a
> > > >> cleaner example.  Full code and output is below, but here is a
> > > >> stylised version:
> > > >>
> > > >> The following all give one answer (which is the wrong answer as far as
> > > >> I'm concerned) :
> > > >>  * printing the whole list :
> > > >>           .C(...)     # and looking at the 7th entry
> > > >>  * applying c() to the 7th element of the list
> > > >>           c(.C(...)[[7]])
> > > >>  * assigning the 7th element to a vector:
> > > >>           x = .C(...)[[7]];
> > > >>           x
> > > >>
> > > >> these give a different answer (which is the answer I would hope the C
> > > >> code returns):
> > > >>  * using dput on the 7th entry:
> > > >>           dput(.C(...)[[7]])
> > > >>  * applying c() and then dput()
> > > >>           dput(c(.C(...)[[7]]))
> > > >>  * just printing the 7th entry of the list
> > > >>           .C(...)[[7]]
> > > >>
> > > >> The answers are consistent in the sense that I always get the same
> > > >> answers from running the same command in the console.  I have tried
> > > >> inspecting the returned objects to see if the objects are somehow of a
> > > >> different class than I expect, or are just being printed oddly, but
> > > >> have not found anything untoward.
> > > >>
> > > >> Any suggestions would be much appreciated!
> > > >>
> > > >> Regards,
> > > >>
> > > >> Robin
> > > >>
> > > >>
> > > >> # THESE COMMANDS GIVE ONE ANSWER
> > > >> # [the correct answer always begins with 1, the incorrect with -1]
> > > >>
> > > >> > .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]
> > > >> [1]  1 -1 -1  1 -1  1  1 -1 -1  1  1 -1  1 -1 -1  1 -1  1  1 -1  1 -1
> > > >> -1  1  1 -1 -1  1 -1  1  1 -1
> > > >>
> > > >> > dput(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]])
> > > >> c(1L, -1L, -1L, 1L, -1L, 1L, 1L, -1L, -1L, 1L, 1L, -1L, 1L, -1L,
> > > >> -1L, 1L, -1L, 1L, 1L, -1L, 1L, -1L, -1L, 1L, 1L, -1L, -1L, 1L,
> > > >> -1L, 1L, 1L, -1L)
> > > >>
> > > >> > x=dput(c(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]))
> > > >> c(1L, -1L, -1L, 1L, -1L, 1L, 1L, -1L, -1L, 1L, 1L, -1L, 1L, -1L,
> > > >>   -1L, 1L, -1L, 1L, 1L, -1L, 1L, -1L, -1L, 1L, 1L, -1L, -1L, 1L,
> > > >>   -1L, 1L, 1L, -1L)
> > > >> > x
> > > >> [1]  1 -1 -1  1 -1  1  1 -1 -1  1  1 -1  1 -1 -1  1 -1  1  1 -1  1 -1
> > > >> -1  1  1 -1 -1  1 -1  1  1 -1
> > > >>
> > > >> # THESE ALL GIVE A DIFFERENT ONE!
> > > >>
> > > >> > .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)
> > > >>
> > > >> # (OTHER ELEMENTS OF LIST REMOVED)
> > > >> [[7]]
> > > >> [1] -1 -1  1  1  1  1 -1 -1  1  1 -1 -1 -1 -1  1  1  1  1 -1 -1 -1 -1
> > > >> 1  1 -1 -1  1  1  1  1 -1 -1
> > > >>
> > > >> > c(.C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]])
> > > >> [1] -1 -1  1  1  1  1 -1 -1  1  1 -1 -1 -1 -1  1  1  1  1 -1 -1 -1 -1
> > > >> 1  1 -1 -1  1  1  1  1 -1 -1
> > > >> > x = .C("oneMargin", c(1L,1L,1L,1L,1L), c(1L,1L,1L,1L,1L), 1L,
> > > >> c(2L,2L,2L,2L,2L), 5L, ptr1, ptr2)[[7]]
> > > >> > x
> > > >> [1] -1 -1  1  1  1  1 -1 -1  1  1 -1 -1 -1 -1  1  1  1  1 -1 -1 -1 -1
> > > >> 1  1 -1 -1  1  1  1  1 -1 -1
> > > >>
> > > >>
> > > >> --
> > > >> Robin Evans
> > > >> Statistical Laboratory
> > > >> University of Cambridge
> > > >> blog: itsastatlife.blogspot.com
> > > >> web: www.statslab.cam.ac.uk/~rje42
> > > >>
> > > >> Causal Inference Workshop July 15th:
> > > >> http://www.statslab.cam.ac.uk/~rje42/uai13/main.htm
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Robin Evans
> > > > Statistical Laboratory
> > > > University of Cambridge
> > > > blog: itsastatlife.blogspot.com
> > > > web: www.statslab.cam.ac.uk/~rje42 <
> > http://www.stat.washington.edu/~rje42>
> > > >
> > > > Causal Inference Workshop July 15th:
> > > > http://www.statslab.cam.ac.uk/~rje42/uai13/main.htm
> > > >
> > > >         [[alternative HTML version deleted]]
> > > >
> > > > ______________________________________________
> > > > R-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel