[Rd] paste strings in C
    Adrian Dușa 
    dusa.adrian at unibuc.ro
       
    Tue Jun 27 14:29:49 CEST 2017
    
    
  
Dear R-devs,
Below is a small example of what I am trying to achieve, that is trivial in
R and I would like to learn how to do in C, for very large matrices:
> (mymat <- matrix(c(1,0,0,2,2,1), nrow = 2))
     [,1] [,2] [,3]
[1,]    1    0    2
[2,]    0    2    1
And I would like to produce:
[1] "a*C" "B*c"
Which can be trivially done in R via something like:
foo <- function(mymat, colnms, tilde = FALSE) {
    apply(mymat, 1, function(x) {
        if (tilde) {
            colnms[x == 1] <- paste0("~", colnms[x == 1])
        } else {
            colnms[x == 1] <- tolower(colnms[x == 1])
        }
        paste(colnms[x > 0], collapse = "*")
    })
}
> foo(mymat, LETTERS[1:3])
[1] "a*C" "B*c"
> foo(mymat, LETTERS[1:3], tilde = TRUE)
[1] "~A*C" "B*~C"
I know that strings in C are far from trivial (encodings being one
important issue), and this is the sort of thing much easier to do in R. On
the other hand I found that, for a large matrix of say 1 million rows and
25 columns, setting the rownames of colnames in R copies the matrix and
costs a lot of memory and time in the process.
Having all necessary headers in C, the solution I came up with involves
calling the function foo() from within C:
SEXP test(SEXP mymat, SEXP colnms, SEXP tilde) {
    SEXP call = PROTECT(LCONS(install("foo"),
                        LCONS(mymat,
                        LCONS(colnms,
                        LCONS(tilde, R_NilValue)))));
    SEXP out = PROTECT(eval(call, R_GlobalEnv));
    UNPROTECT(2);
    return(out);
}
After compilation, say in a file called test.c, back in R I get:
> dyn.load("test.so")
> .Call("test", mymat, LETTERS[1:3], FALSE)
[1] "a*C" "B*c"
> .Call("test", mymat, LETTERS[1:3], TRUE)
[1] "~A*C" "B*~C"
In my real situation, the matrix I am working on is produced in the C code
(and it's much larger).
I don't know for sure, when calling the R function foo(), if the matrix is
copied: if not, this might be the best solution for me.
Otherwise I know there is a function do_paste() in C, and wondered whether
I could use that directly instead of calling R from C.
I hope this explains what I would like to do, many thanks in advance for
any hint,
Adrian
-- 
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr. 90-92
050663 Bucharest sector 5
Romania
	[[alternative HTML version deleted]]
    
    
More information about the R-devel
mailing list