[R] Behaviors of diag() with character vector in R 3.0.0

R. Michael Weylandt michael.weylandt at gmail.com
Tue Apr 9 18:44:17 CEST 2013


On Tue, Apr 9, 2013 at 7:15 AM, Mike Cheung <mikewlcheung at gmail.com> wrote:
> Dear all,
>
> According to CHANGES IN R 3.0.0:
>  o diag() as used to generate a diagonal matrix has been re-written
>       in C for speed and less memory usage.  It now forces the result
>       to be numeric in the case diag(x) since it is said to have 'zero
>       off-diagonal entries'.
>
> diag(x) no longer works for character vectors in R 3.0.0. For example,
> v <- c("a", "b")
>
> ## R 2.15.3
> diag(v)
>      [,1] [,2]
> [1,] "a"  "0"
> [2,] "0"  "b"
>
> ## R 3.0.0
> diag(v)
>      [,1] [,2]
> [1,]   NA    0
> [2,]    0   NA
> Warning message:
> In diag(v) : NAs introduced by coercion
>
> Regarding the character matrix, it still works. For example,
> m <- matrix(c("a", "b", "c", "d"), nrow=2)
> diag(m)
> ## Both R 2.15.3 and 3.0.0
> [1] "a" "d"
>
> n <- matrix(0, ncol=2, nrow=2)
> diag(n) <- v
> n
> ## Both R 2.15.3 and 3.0.0
>      [,1] [,2]
> [1,] "a"  "0"
> [2,] "0"  "b"
>
> I understand that the above behavior follows exactly what the manual says.
> It appears to me that the version in 2.15.3 is more general as it works for
> both numeric and character vectors and matrices, whereas the version in
> 3.0.0 works for character matrices but not character vectors.
>
> Would it be possible to retain the behaviors of diag() for character
> vectors? Thanks.

Pursuant to what the NEWS file says, I'm not sure it's a good idea,
but here's a patch against a recent R-devel which I believe restores
the old behavior. It's not coming out of svn diff too cleanly, but it
applies as it should.

Should it be adopted, someone with more taste might want to move case
VECSXP or case RAWSXP to error out as well.

Michael

Index: array.c
===================================================================
--- array.c (revision 62536)
+++ array.c (working copy)
@@ -1539,26 +1539,41 @@
  error(_("too many elements specified"));
 #endif

-   if (TYPEOF(x) == CPLXSXP) {
+   int nx = LENGTH(x);
+   R_xlen_t NR = nr;
+
+   switch(TYPEOF(x)){
+   case CPLXSXP:
        PROTECT(ans = allocMatrix(CPLXSXP, nr, nc));
-       int nx = LENGTH(x);
-       R_xlen_t NR = nr;
-       Rcomplex *rx = COMPLEX(x), *ra = COMPLEX(ans), zero;
+       Rcomplex *cx = COMPLEX(x), *ca = COMPLEX(ans), zero;
        zero.r = zero.i = 0.0;
-       for (R_xlen_t i = 0; i < NR*nc; i++) ra[i] = zero;
-       for (int j = 0; j < mn; j++) ra[j * (NR+1)] = rx[j % nx];
-  } else {
-       if(TYPEOF(x) != REALSXP) {
+       for(R_xlen_t i = 0; i < NR*nc; i++) ca[i] = zero;
+       for(int j = 0; j < mn; j++) ca[j*(NR+1)] = cx[j % nx];
+       break;
+   case LGLSXP:
+   case REALSXP:
+   case INTSXP:
+   case RAWSXP:
+   case VECSXP:
+       if(TYPEOF(x) != REALSXP){
    PROTECT(x = coerceVector(x, REALSXP));
    nprotect++;
        }
        PROTECT(ans = allocMatrix(REALSXP, nr, nc));
-       int nx = LENGTH(x);
-       R_xlen_t NR = nr;
        double *rx = REAL(x), *ra = REAL(ans);
        for (R_xlen_t i = 0; i < NR*nc; i++) ra[i] = 0.0;
        for (int j = 0; j < mn; j++) ra[j * (NR+1)] = rx[j % nx];
+       break;
+   case STRSXP:
+     PROTECT(ans = allocMatrix(STRSXP, nr, nc));
+     for (R_xlen_t i = 0; i < NR*nc; i++) SET_STRING_ELT(ans, i, mkChar("0")); // Odd to put character 0 here?
+     for (R_xlen_t j = 0; j < mn; j++) SET_STRING_ELT(ans, j*(NR+1), STRING_ELT(x, j % nx)); // recycle x like the other branches
+     break;
+   default:
+     error(_("'data' must be of a vector type, was '%s'"),
+   type2char(TYPEOF(x)));
    }
+
    UNPROTECT(nprotect);
    return ans;
 }
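
For anyone reading the patch without an R source tree at hand, the index
arithmetic it relies on can be sketched in plain C. This is only an
illustration under stated assumptions, not R's implementation: the function
name fill_diag_strings and its parameters are hypothetical, plain C string
pointers stand in for R's CHARSXP elements, and the matrix is assumed to be
stored column-major as in R. It shows why element (j, j) lives at offset
j*(nr+1) and how j % nx recycles a short input vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of R): fill a column-major nr x nc matrix
 * of string pointers so that the diagonal holds x, recycled as needed, and
 * every other cell points at the caller-supplied `zero` string. */
void fill_diag_strings(const char **ans, size_t nr, size_t nc,
                       const char *const *x, size_t nx,
                       const char *zero)
{
    assert(nx > 0);                       /* j % nx below needs nx != 0 */
    size_t mn = nr < nc ? nr : nc;        /* length of the diagonal */

    /* Off-diagonal default first, over the whole matrix. */
    for (size_t i = 0; i < nr * nc; i++)
        ans[i] = zero;

    /* In column-major storage, (row j, col j) is at j + j*nr = j*(nr+1). */
    for (size_t j = 0; j < mn; j++)
        ans[j * (nr + 1)] = x[j % nx];
}
```

With x = {"a", "b"} and a 3 x 3 result, the diagonal offsets are 0, 4 and 8,
and the third diagonal entry wraps around to "a" again.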

