[Rd] Bug in do_strsplit (PR#13742)

waku at idi.ntnu.no
Tue Jun 2 11:30:14 CEST 2009


Full_Name: Wacek Kusnierczyk
Version: 2.10.0 r48689
OS: Ubuntu 8.04 Linux 32b
Submission from: (NULL) (129.241.199.78)


src/main/character.c:435-438 (do_strsplit) contains the following code:

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, 0)) == CE_UTF8) use_UTF8 = TRUE;
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, 0)) == CE_UTF8) use_UTF8 = TRUE;

neither loop body depends on the loop variable 'i': every iteration tests
element 0, so the check is loop-invariant and elements 1 and up are never
examined. either the loops are redundant, or the fixed index '0' was copied
over from some other place and should be replaced with 'i'.
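
to make the difference concrete, here is a minimal sketch that compiles without the R headers. it assumes the intent is "set the flag if any element is UTF-8". enc_t, ENC_NATIVE, ENC_UTF8 and the two function names are hypothetical stand-ins, not part of R; a plain array of encoding tags takes the place of getCharCE(STRING_ELT(...)).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the encoding tag returned by getCharCE(). */
typedef enum { ENC_NATIVE = 0, ENC_UTF8 = 1 } enc_t;

/* Mirrors the reported loops: the body always inspects element 0. */
static int any_utf8_index0(const enc_t *enc, size_t n)
{
    int use_utf8 = 0;
    assert(enc != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (enc[0] == ENC_UTF8) use_utf8 = 1;
    return use_utf8;
}

/* Assumed intent: inspect element i and stop at the first UTF-8 hit. */
static int any_utf8_indexi(const enc_t *enc, size_t n)
{
    int use_utf8 = 0;
    assert(enc != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (enc[i] == ENC_UTF8) {
            use_utf8 = 1;
            break;
        }
    return use_utf8;
}
```

for a vector whose first element is not UTF-8 but a later one is, the index-0 version leaves the flag unset while the index-i version sets it.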

assuming every element is meant to be tested, the bug can be fixed with

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break;
        }
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break;
        }

or with

    #define CHECK_CE(CHARACTER, LENGTH, USEUTF8) \
        for (i = 0; i < (LENGTH); i++) \
            if (getCharCE(STRING_ELT((CHARACTER), i)) == CE_UTF8) { \
                (USEUTF8) = TRUE; \
                break; \
            }
    CHECK_CE(tok, tlen, use_UTF8)
    CHECK_CE(x, len, use_UTF8)
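
a macro that expands to a bare for/if relies on an 'i' in the calling scope and can pair with a following 'else' by accident. one possible variant, sketched here as a suggestion only, wraps the body in do { ... } while (0) and declares its own index (this needs C99). enc_t, CHECK_ENC and needs_utf8 are hypothetical names used so the example compiles without the R headers; they are not part of R.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the encoding tag returned by getCharCE(). */
typedef enum { ENC_NATIVE = 0, ENC_UTF8 = 1 } enc_t;

/* Same scan as CHECK_CE, but with a private index and a do/while(0)
   wrapper so the macro behaves like a single statement. */
#define CHECK_ENC(ENC, LENGTH, USEUTF8)                  \
    do {                                                 \
        for (size_t k_ = 0; k_ < (LENGTH); k_++)         \
            if ((ENC)[k_] == ENC_UTF8) {                 \
                (USEUTF8) = 1;                           \
                break; /* leaves the for loop only */    \
            }                                            \
    } while (0)

/* Returns 1 if either vector contains a UTF-8 element, else 0. */
static int needs_utf8(const enc_t *tok, size_t tlen,
                      const enc_t *x, size_t len)
{
    int use_utf8 = 0;
    assert((tok != NULL || tlen == 0) && (x != NULL || len == 0));
    CHECK_ENC(tok, tlen, use_utf8);
    if (!use_utf8)              /* skip the second scan once the flag is set */
        CHECK_ENC(x, len, use_utf8);
    return use_utf8;
}
```

with this form the invocation takes a trailing semicolon like any other statement, and the second scan can be skipped when the first has already set the flag.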


