[Rd] translateChar in NewName in bind.c

Suharto Anggono Suharto Anggono suharto_anggono at yahoo.com
Tue Aug 1 18:54:34 CEST 2017


For the 2nd example, I would say that the R 3.4.1 result is acceptable, since names(c(x)) and names(x) are equal.

The change exposed by the 2nd example is in line with the statement in the NEWS item corresponding to PR#17284: "c() and unlist() are now more efficient in constructing the names(.) of their return value, ...." However, that NEWS item is currently listed for R-devel, not for R 3.4.1 patched.
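
To make the equality claim concrete, here is a small sketch built on the 2nd example quoted below. It relies on the documented behaviour that identical() and == treat strings in different marked encodings as equal when they agree after translation to UTF-8. The variable names same_identical and same_equal are only for illustration.

```r
# 2nd example from the quoted message: a name marked as latin1.
x <- 0
names(x) <- "\xe7"
Encoding(names(x)) <- "latin1"
res <- c(x)

# Both comparisons translate differently marked strings to UTF-8 first,
# so they hold whether names(res) ends up marked "latin1" or "UTF-8".
same_identical <- identical(names(res), names(x))
same_equal <- names(res) == names(x)
print(c(same_identical, same_equal))

# The encoding mark and the raw bytes are where R versions can differ.
print(Encoding(names(res)))
print(charToRaw(names(res)))
```

So the names compare equal in either case; only the mark and the byte representation depend on which bind.c is in use.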

--------------------------------------------
On Mon, 31/7/17, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

 Subject: Re: [Rd] translateChar in NewName in bind.c

 Cc: r-devel at r-project.org
 Date: Monday, 31 July, 2017, 8:38 PM
 
>>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org>
>>>>>     on Sun, 30 Jul 2017 14:57:53 +0000 writes:

    > R-devel's bind.c has been ported to R-patched. Is it OK that the names of an 'unlist' or 'c' result may not be strictly the same as in R 3.4.1, because of the changed function 'NewName' in bind.c?

    > Using 'translateCharUTF8' instead of 'translateChar' is as it should be. It has an effect in a non-UTF-8 locale for this example.

    > x <- list(1:2)
    > names(x) <- "\ue7"
    > res <- unlist(x)
    > charToRaw(names(res)[1])

    > Directly assigning 'tag' to 'ans' is more efficient, but
    > the result may differ from R 3.4.1, which goes through
    > 'translateCharUTF8'; that is also correct. It has an
    > effect for this example.

    > x <- 0
    > names(x) <- "\xe7"
    > Encoding(names(x)) <- "latin1"
    > res <- c(x)
    > Encoding(names(res))
    > charToRaw(names(res))

Yes, you are right, thank you:

That part of the changes in bind.c was *not* directly related to
the two R-bugs (PR#17284 & PR#17292)... and therefore, maybe I
should not have ported it to R-patched (= R 3.4.1 patched).

Your examples above are instructive. Notably, the 2nd one seems
to demonstrate that the change also *did* fix a bug:

   Encoding(names(res))

is "latin1" in R-devel  but interestingly is "UTF-8" in R 3.4.1,
indeed independently of the locale.

I would argue R-devel (and current R-patched) is more faithful
by keeping the Encoding "latin1" that was set for names(x) also
in the  names(c(x)) .
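
The two outcomes can be told apart by the encoding mark and the raw bytes. As a rough illustration (this is not the code in bind.c), the sketch below uses enc2utf8() as a stand-in for the translation that R 3.4.1 effectively applied to the name. That substitution, and the variable names nm and nm_utf8, are assumptions made for illustration only.

```r
nm <- "\xe7"
Encoding(nm) <- "latin1"

# Name kept as it was set: one byte, marked latin1
# (the R-devel / current R-patched outcome for names(c(x))).
print(Encoding(nm))        # "latin1"
print(charToRaw(nm))       # e7

# Name translated to UTF-8: two bytes, marked UTF-8
# (stand-in for the R 3.4.1 outcome).
nm_utf8 <- enc2utf8(nm)
print(Encoding(nm_utf8))   # "UTF-8"
print(charToRaw(nm_utf8))  # c3 a7
```

Both strings denote the same character; the difference is only in how it is stored and marked.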

I could revert R-patched's bind.c (so it only contains the two
official bug fixes, PR#172(84|92)), but I wonder if that is
desirable in this case.

I'd be glad to hear further reasoning.
Given current "knowledge"/"evidence", I would not revert
R-patched to R 3.4.1's behavior.

Martin


