[Rd] Dynamic list creation (SEXP in C) returns error "unimplemented type (29) in 'duplicate'"

Romain Francois romain at r-enthusiasts.com
Thu Nov 7 14:17:39 CET 2013


Hello,

Any particular reason you're not using Rcpp? You would have access to 
nice abstractions instead of these macros all over the place.

The cost of these abstractions is close to 0.

Calling SET_LENGTH inside a loop is going to be quite expensive, because 
each call allocates a new vector and copies the old contents into it. I 
would urge you to accumulate the data in a structure that knows how to 
grow efficiently, e.g. a std::vector, and then convert that to an R 
vector when you're done with it.

Romain

On 07/11/2013 14:03, George Vega Yon wrote:
> Hi!
>
> I didn't want to do this, but I think that this is the easiest way
> for you to understand my problem (thanks again for all the comments
> that you have made). Here is a copy of the function that I'm working
> on. This may be tedious to analyze, so I understand if you don't feel
> keen to give it the time. Having dedicated many hours to this (as a new
> user of both C and the R C API), I would be very pleased to know what
> I am doing wrong here.
>
> G0 is a Nx2 matrix. The first column is a group id (can be shared with
> several observations) and the second tells how many individuals are in
> that group. This matrix can look something like this:
>
> id_group  nreps
> 1  3
> 1  3
> 1  3
> 2  1
> 3  1
> 4  2
> 5  1
> 6  1
> 4  2
> ...
>
> L0 is a list of two-column data.frames with different sizes. The first
> column (id) holds row indexes (with values 1 to N) and the second column
> holds real numbers. L0 can look something like this:
> [[1]]
> id  lambda
> 3  0.5
> 15  0.3
> 25  0.2
> [[2]]
> id  lambda
> 15  0.8
> 40  0.2
> ...
> [[N]]
> id  lambda
> 80  1
>
> TE0 is an int scalar in {0,1,2}
>
> T0 is a dichotomous vector of length N that can look something like this
> [1] 0 1 0 1 1 1 0 ...
> [N] 1
>
> L1 (the expected output) is a modified version of L0, that, for
> instance can look something like this (note the rows marked with "*")
>
> [[1]]
> id  lambda
> 3  0.5
> *15  0.15 (15 is in the same group as 50, so I added a new row for 50 and
> divided the value of lambda by two)
> 25  0.2
> *50  0.15
> [[2]]
> id  lambda
> 15  0.8
> 40  0.2
> ...
> [[N]]
> id  lambda
> *80  0.333 (80 shares its group id with 30 and 100, so lambda is divided by 3)
> *30  0.333
> *100 0.333
>
> That said, the function is as follows
>
> SEXP distribute_lambdas(
>    SEXP G0,  // Group ids (Nx2 matrix). First column: group id;
>              // second column: number of elements in the group
>    SEXP L0,  // List of N two-column dataframes with different number of rows
>    SEXP TE0, // Treatment effect (int scalar): ATE(0) ATT(1) ATC(2)
>    SEXP T0   // Treat var (bool vector, 0/1, of size N)
> )
> {
>
>    int i, j, l, m;
>    const int *G = INTEGER_POINTER(PROTECT(G0 = AS_INTEGER(G0 )));
>    const int *T = INTEGER_POINTER(PROTECT(T0 = AS_INTEGER(T0 )));
>    const int *TE= INTEGER_POINTER(PROTECT(TE0= AS_INTEGER(TE0)));
>    double *L, val;
>    int *I, nlambdas, nreps;
>
>    const int n = length(T0);
>
>    PROTECT_INDEX pin0, pin1;
>    SEXP L1;
>    PROTECT(L1 = allocVector(VECSXP,n));
>    SEXP id, lambda;
>
>    // Fixing size
>    for(i=0;i<n;i++)
>    {
>      SET_VECTOR_ELT(L1, i, allocVector(VECSXP, 2));
>    //  SET_VECTOR_ELT(VECTOR_ELT(L1,i), 0, NEW_INTEGER(100));
>    //  SET_VECTOR_ELT(VECTOR_ELT(L1,i), 1, NEW_NUMERIC(100));
>    }
>
>    // Loop over the list, i.e. over the observations
>    for(i=0;i<n;i++)
>    {
>
>      R_CheckUserInterrupt();
>
>      // Checking whether this element has to be analyzed.
>      if (
>        ((TE[0] == 1 & !T[i]) | (TE[0] == 2 & T[i])) |
>        (length(VECTOR_ELT(L0,i)) != 2)
>      )
>      {
>        SET_VECTOR_ELT(L1,i,R_NilValue);
>        continue;
>      }
>
>      // Checking how many rows the i-th data.frame has
>      nlambdas = length(VECTOR_ELT(VECTOR_ELT(L0,i),0));
>
>      // Pointing to the data.frame's original values
>      I = INTEGER_POINTER(AS_INTEGER(PROTECT(VECTOR_ELT(VECTOR_ELT(L0,i),0))));
>      L = NUMERIC_POINTER(AS_NUMERIC(PROTECT(VECTOR_ELT(VECTOR_ELT(L0,i),1))));
>
>      // Creating a copy of the pointed values
>      PROTECT_WITH_INDEX(id   = duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),0)), &pin0);
>      PROTECT_WITH_INDEX(lambda=duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),1)), &pin1);
>
>      // Over the rows of the i-th data.frame
>      nreps=0;
>      for(l=0;l<nlambdas;l++)
>      {
>        // If the current lambda id is repeated, i.e. there are more individuals
>        // with the same covariates, then enter.
>        if (G[n+I[l]-1] > 1)
>        {
>          /* Changing the length of the object */
>          REPROTECT(SET_LENGTH(id,    length(lambda) + G[n+I[l]-1] -1), pin0);
>          REPROTECT(SET_LENGTH(lambda,length(lambda) + G[n+I[l]-1] -1), pin1);
>
>          // Getting the new value
>          val = L[l]/G[n+I[l] - 1];
>          REAL(lambda)[l] = val;
>
>          // Looping over the full set of groups
>          m = -1,j = -1;
>          while(m < (G[n+I[l]-1] - 1))
>          {
>            // Looking for individuals in the same group
>            if (G[++j] != G[I[l]-1]) continue;
>
>            // If it is the current lambda, then do not assign it
>            if (j == (I[l] - 1)) continue;
>
>            INTEGER(id)[length(id) - (G[n+I[l]-1] - 1) + ++m] = j+1;
>            REAL(lambda)[length(id) - (G[n+I[l]-1] - 1) + m] = val;
>          }
>
>          nreps+=1;
>        }
>      }
>
>      if (nreps)
>      {
>        // Replacing the elements of the list (modified)
>        SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>        SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>      }
>      else {
>        // Setting the list with the old elements
>        SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0,
>          duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),0)));
>        SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1,
>          duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),1)));
>      }
>
>      // Unprotecting elements
>      UNPROTECT(4);
>    }
>
>    Rprintf("Exito\n") ;
>    UNPROTECT(4);
>
>    return L1;
> }
>
> Thanks again in advance.
>
> George Vega Yon
> +56 9 7 647 2552
> http://ggvega.cl
>
> 2013/11/5 George Vega Yon <g.vegayon at gmail.com>:
>> Either way, understanding that it may not be the best way to do it, is
>> there anything wrong in what I'm doing?
>> George Vega Yon
>> +56 9 7 647 2552
>> http://ggvega.cl
>>
>>
>> 2013/11/5 Gabriel Becker <gmbecker at ucdavis.edu>:
>>> George,
>>>
>>> My point is you don't need to create them and then grow them....
>>>
>>>
>>> for(i=0;i<n;i++)
>>> {
>>>    // Creating the "id" and "lambda" vectors. I do this in every
>>>    // repetition of the loop.
>>>
>>>    // ... Some other instructions where I set the value of an integer
>>>    // z, which tells how much the vectors have to grow ...
>>>
>>> PROTECT(id=allocVector(INTSXP, 4 +z));
>>> PROTECT(lambda=allocVector(REALSXP, 4 +z));
>>>
>>>
>>>    // ... some lines where I fill the vectors ...
>>>
>>>    // Storing the new vectors at the i-th element of the list
>>>    SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>>>    SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>>>
>>>    // Unprotecting the "id" and "lambda" vectors
>>>    UNPROTECT(2);
>>> }
>>>
>>> ~G
>>>
>>>
>>> On Tue, Nov 5, 2013 at 1:56 PM, George Vega Yon <g.vegayon at gmail.com> wrote:
>>>>
>>>> Gabriel,
>>>>
>>>> While the length (in terms of number of SEXP elements it stores) of L1
>>>> doesn't change, the vectors within L1 do (sorry if I didn't explain
>>>> it well before).
>>>>
>>>> The post was about a SEXP object that grows, in my case, every pair of
>>>> vectors in L1 (id and lambda) can change lengths, this is why I need
>>>> to reprotect them. I populate the i-th element of L1 by creating the
>>>> vectors "id" and "lambda", setting the length of these according to
>>>> some rule (that's the part where lengths change)... here is a reduced
>>>> form of my code:
>>>>
>>>> //////////////////////////////////////// C
>>>> ////////////////////////////////////////
>>>> const int n = length(L0);
>>>> SEXP L1;
>>>> PROTECT(L1 = allocVector(VECSXP,n));
>>>> SEXP id, lambda;
>>>>
>>>> // Fixing size
>>>> for(i=0;i<n;i++)
>>>>    SET_VECTOR_ELT(L1, i, allocVector(VECSXP, 2));
>>>>
>>>> for(i=0;i<n;i++)
>>>> {
>>>>    // Creating the "id" and "lambda" vectors. I do this in every
>>>>    // repetition of the loop.
>>>>    PROTECT_WITH_INDEX(id=allocVector(INTSXP, 4), &ipx0);
>>>>    PROTECT_WITH_INDEX(lambda=allocVector(REALSXP, 4), &ipx1);
>>>>
>>>>    // ... Some other instructions where I set the value of an integer
>>>>    // z, which tells how much the vectors have to grow ...
>>>>
>>>>    REPROTECT(SET_LENGTH(id,    length(lambda) + z), ipx0);
>>>>    REPROTECT(SET_LENGTH(lambda,length(lambda) + z), ipx1);
>>>>
>>>>    // ... some lines where I fill the vectors ...
>>>>
>>>>    // Storing the new vectors at the i-th element of the list
>>>>    SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>>>>    SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>>>>
>>>>    // Unprotecting the "id" and "lambda" vectors
>>>>    UNPROTECT(2);
>>>> }
>>>>
>>>> UNPROTECT(1);
>>>>
>>>> return L1;
>>>> //////////////////////////////////////// C
>>>> ////////////////////////////////////////
>>>>
>>>> I can't set the length from the start because every pair of vectors in
>>>> L1 have different lengths, lengths that I cannot tell before starting
>>>> the loop.
>>>>
>>>> Thanks for your help,
>>>>
>>>> Regards,
>>>>
>>>> George Vega Yon
>>>> +56 9 7 647 2552
>>>> http://ggvega.cl
>>>>
>>>>
>>>> 2013/11/5 Gabriel Becker <gmbecker at ucdavis.edu>:
>>>>> George,
>>>>>
>>>>> I don't see the relevance of the stackoverflow post you linked. In the
>>>>> post, the author wanted to change the length of an existing "mother
>>>>> list" (matrix, etc), while you specifically state that the length of
>>>>> L1 will not change.
>>>>>
>>>>> You say that the child lists (vectors if they are INTSXP/REALSXP) are
>>>>> variable, but that is not what the linked post was about unless I am
>>>>> completely missing something.
>>>>>
>>>>> I can't really say more without knowing the details of how the vectors
>>>>> are being created and why they cannot just have the right length from
>>>>> the start.
>>>>>
>>>>> As for the error, that is a weird one. I imagine it means that a SEXP
>>>>> thinks that it has a type other than ones defined in Rinternals. I
>>>>> can't speak to how that could have happened from what you posted though.
>>>>>
>>>>> Sorry I can't be of more help,
>>>>> ~G
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 4, 2013 at 8:00 PM, George Vega Yon <g.vegayon at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Dear R-devel,
>>>>>>
>>>>>> A couple of weeks ago I started to use the R C API for package
>>>>>> development. Without knowing much about C, I've been able to write
>>>>>> some routines successfully... until now.
>>>>>>
>>>>>> My problem consists in dynamically creating a list ("L1") of lists
>>>>>> using .Call. The tricky part is that each element of the "mother list"
>>>>>> contains two vectors (INTSXP and REALSXP types) with varying sizes;
>>>>>> sizes that I set while I'm looping over the elements of another list
>>>>>> ("L0", the input list). The steps I've followed are:
>>>>>>
>>>>>> FIRST: Create the "mother list" of size "n=length(L0)" (doesn't
>>>>>> change) and protect it as
>>>>>>    PROTECT(L1=allocVector(VECSXP, length(L0)))
>>>>>> and fill it with vectors of length two:
>>>>>>    for(i=0;i<n;i++) SET_VECTOR_ELT(L1,i, allocVector(VECSXP, 2));
>>>>>>
>>>>>> then, for each element of the mother list:
>>>>>>
>>>>>>    for(i=0;i<n;i++) {
>>>>>>
>>>>>> SECOND: By reading this post in Stackoverflow
>>>>>>
>>>>>>
>>>>>> http://stackoverflow.com/questions/7458364/growing-an-r-matrix-inside-a-c-loop/7458516#7458516
>>>>>> I understood that it was necessary to (1) create the "child lists" and
>>>>>> protect them with PROTECT_WITH_INDEX, and (2) change their size
>>>>>> using SET_LENGTH (Rf_lengthgets) and REPROTECT the lists in order
>>>>>> to tell the GC that the vectors had changed.
>>>>>>
>>>>>> THIRD: Once my two vectors are done ("id" and "lambda"), assign them
>>>>>> to the i-th element of the "mother list" L1 using
>>>>>>    SET_VECTOR_ELT(VECTOR_ELT(L1,i), 0, duplicate(id));
>>>>>>    SET_VECTOR_ELT(VECTOR_ELT(L1,i), 1, duplicate(lambda));
>>>>>>
>>>>>> and unprotecting the elements protected with index: UNPROTECT(2);
>>>>>>
>>>>>> }
>>>>>>
>>>>>> FOURTH: Unprotecting the "mother list" (L1) and return it to R
>>>>>>
>>>>>> With small datasets this works fine, but after trying with bigger ones
>>>>>> R (my code) keeps failing and returning a strange error that I haven't
>>>>>> been able to identify (or find on the web)
>>>>>>
>>>>>>    "unimplemented type (29) in 'duplicate'"
>>>>>>
>>>>>> This happens right after I try to use the returned list from my
>>>>>> routine (trying to print it or building a data-frame).
>>>>>>
>>>>>> Does anyone have an idea of what I am doing wrong?
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> PS: I didn't want to copy the entire function... but if you need it
>>>>>> I can do it.
>>>>>>
>>>>>> George Vega Yon
>>>>>> +56 9 7 647 2552
>>>>>> http://ggvega.cl
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gabriel Becker
>>>>> Graduate Student
>>>>> Statistics Department
>>>>> University of California, Davis
>>>
>>>
>>>
>>>
>>> --
>>> Gabriel Becker
>>> Graduate Student
>>> Statistics Department
>>> University of California, Davis
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30


