[R] source script file that contains Unicode non-English characters

Faridedin Cheraghi faridcher using gmail.com
Fri Aug 17 16:07:59 CEST 2018


Dear Duncan,

thanks for your feedback on this. Most developers may not be on Windows
(though I doubt that), but a huge number of people use R on Windows, and I
am one of those who work with R seriously. Following my own workaround for
this bug, I have now hit another issue, which needs yet another workaround,
when trying to render Farsi Unicode characters. While these workarounds
work ad hoc, they do not suit all scenarios; I hit other problems related
to this bug, e.g., when documenting a package with roxygen2.
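
For anyone following along without the attachments, here is a minimal sketch
of the workaround quoted further down, written without pipes. The temporary
file, its one-line contents, and the names farsi, script and x are made up
for illustration; this is not the attached script, and how it behaves in a
non-UTF-8 Windows locale is exactly what this thread is about.

```r
# Hypothetical setup: write a one-line UTF-8 script containing a Farsi string.
farsi  <- "\u0633\u0644\u0627\u0645"   # a Farsi word built from \u escapes
script <- tempfile(fileext = ".R")
# useBytes = TRUE writes the UTF-8 bytes as they are, with no re-encoding.
writeLines(paste0('x <- "', farsi, '"'), script, useBytes = TRUE)

# The workaround: let the connection declare the encoding, then source it.
con <- file(script, encoding = "UTF-8")
source(con)
close(con)

# Compare code points rather than printed output, which depends on the console.
identical(utf8ToInt(enc2utf8(x)), utf8ToInt(farsi))
```

Comparing code points avoids judging the result by how the console happens to
render the characters.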

Please see the attached files (R scripts) for the complete bug report.

thanks
Farid

On Sun, Aug 12, 2018 at 9:03 PM, Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:

> On 12/08/2018 11:48 AM, Faridedin Cheraghi wrote:
>
>> That's right, and I don't want to change my locale. My sessionInfo():
>>
>
> I think it could be another manifestation of a known bug on Windows, where
> strings are converted from UTF-8 to the current locale and back to UTF-8, a
> lossy conversion.  This has been present for many years, and requires a lot
> of internal changes to fix, so I wouldn't hold your breath waiting for a
> fix.
>
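
A rough illustration of why such a round trip loses information, using
iconv() as a stand-in for the conversions R does internally (an assumption
for demonstration, not a trace of R's actual code path). "CP1252" matches the
code page in the sessionInfo() output below; the sample strings are made up.

```r
# Round-trip a string through a single-byte code page and back to UTF-8.
round_trip <- function(s, codepage = "CP1252") {
  # sub = "?" replaces bytes that cannot be represented instead of returning NA.
  native <- iconv(s, from = "UTF-8", to = codepage, sub = "?")
  iconv(native, from = codepage, to = "UTF-8")
}

farsi <- "\u0633\u0644\u0627\u0645"   # Farsi letters: not in code page 1252
latin <- "caf\u00e9"                  # e-acute: present in code page 1252

farsi_survives <- identical(round_trip(farsi), farsi)
latin_survives <- identical(round_trip(latin), latin)

# Only the string whose characters exist in the code page survives intact.
c(farsi = farsi_survives, latin = latin_survives)
```

The Farsi text cannot be represented in code page 1252, so the first
conversion already discards it and nothing on the way back can restore it.
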
> I believe the "right" fix is for R to always convert strings to UTF-8
> internally.  This wasn't possible when the internationalization code was
> added many years ago because not all platforms supported UTF-8.  It would
> be a lot of work now, and since it isn't needed now on the platforms most
> developers use, it's not receiving a lot of attention.
>
> Your workaround
>
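> library(magrittr)  # provides the %T>% (tee) and %>% pipes
>
> # Open the script as a UTF-8 connection, source it, then close it.
> file(script,
>      encoding = "UTF-8") %T>%
>      source() %>%
>      close()   # works fine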
>
> is a nice way to avoid this problem.
>
> Duncan Murdoch
>
>
>> R version 3.5.1 (2018-07-02)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> Running under: Windows >= 8 x64 (build 9200)
>>
>> Matrix products: default
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> thanks
>>
>> On Sun, Aug 12, 2018 at 8:00 PM, Duncan Murdoch
>> <murdoch.duncan using gmail.com> wrote:
>>
>>     On 12/08/2018 3:09 AM, Faridedin Cheraghi wrote:
>>
>>         It was actually a .rmd file so you can get the coloring of the
>>         bug report
>>         in your text editor. I changed the format to .txt.
>>
>>
>>     When I run your script on a Mac (in a UTF-8 locale), all lines work
>>     as expected.  I'm guessing you are working on Windows, in a
>>     non-UTF-8 locale?
>>
>>     Posting sessionInfo() would be helpful.
>>
>>     Duncan Murdoch
>>
>>
>>
>>         -Farid
>>
>>         On Sun, Aug 12, 2018 at 7:24 AM, Jeff Newmiller
>>         <jdnewmil using dcn.davis.ca.us> wrote:
>>
>>             ... and read the Posting Guide... only a few file types will
>>             ever make it
>>             through the mailing list so repeatedly sending files not
>>             among those few
>>             types would just be frustrating for everyone.
>>
>>             On August 11, 2018 4:51:43 PM PDT, Jim Lemon
>>             <drjimlemon using gmail.com> wrote:
>>
>>                 Hi Farid,
>>                 Whatever you attached has not gotten through.
>>
>>                 Jim
>>
>>                 On Sat, Aug 11, 2018 at 6:47 PM, Farid Ch
>>                 <faridcher using gmail.com> wrote:
>>
>>                     Hi all,
>>
>>                     Please check the attached file.
>>
>>                     Thanks
>>                     Farid
>>
>>
>>                     ______________________________________________
>>                     R-help using r-project.org <mailto:R-help using r-project.org>
>>                     mailing list -- To UNSUBSCRIBE and more, see
>>                     https://stat.ethz.ch/mailman/listinfo/r-help
>>                     <https://stat.ethz.ch/mailman/listinfo/r-help>
>>                     PLEASE do read the posting guide
>>
>>                 http://www.R-project.org/posting-guide.html
>>                 <http://www.R-project.org/posting-guide.html>
>>
>>                     and provide commented, minimal, self-contained,
>>                     reproducible code.
>>
>>
>>             --
>>             Sent from my phone. Please excuse my brevity.
>>
>>
>>
>>
>>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug01_right.png
Type: image/png
Size: 3913 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20180817/0f754c29/attachment-0004.png>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug01_wrong.png
Type: image/png
Size: 6462 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20180817/0f754c29/attachment-0005.png>

