[Rd] encoding argument of source() in 3.5.0

Tomas Kalibera
Tue Jun 5 16:03:54 CEST 2018


Thanks for the report, fixed in R-devel (74848).
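
For anyone who cannot yet move to a build containing the fix, one possible workaround, suggested by the examples quoted below, is to keep `encoding` off the connection and pass it to readLines() instead, where it only marks the strings as UTF-8 rather than re-encoding the input. A minimal sketch, assuming the file really is UTF-8; source_utf8() is a hypothetical helper name (not part of base R), and a temporary file stands in for the URL so the example runs offline:

```r
# Hypothetical helper (not part of base R): read UTF-8 R code without a
# re-encoding connection, then parse and evaluate it.
source_utf8 <- function(path_or_url, envir = parent.frame()) {
  # No `encoding` is set on the connection itself. The `encoding` argument
  # of readLines() only marks the strings as UTF-8; it does not re-encode.
  lines <- readLines(path_or_url, encoding = "UTF-8", warn = FALSE)
  eval(parse(text = lines, keep.source = FALSE), envir = envir)
  invisible(lines)
}

# Offline demonstration: a temporary file stands in for the remote script.
tmp <- tempfile(fileext = ".R")
code <- 'source.test2 <- function() {\n    print("Non-ascii: \u00e4\u00f6\u00fc\u00df")\n}'
writeLines(enc2utf8(code), tmp, useBytes = TRUE)

env <- new.env()
lines_read <- source_utf8(tmp, envir = env)
result <- env$source.test2()
```

With a URL, the call would be source_utf8(urlR), relying on readLines(urlR, encoding = "UTF-8") behaving as in the quoted example. Unlike source(), this sketch does not keep source references or echo expressions.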

Best
Tomas

On 06/04/2018 02:41 PM, NELSON, Michael wrote:
>
> On R 3.5.0 (Mac)
>
> The issue appears when using the default (libcurl) method and specifying the encoding.
>
> Note that method='internal' causes a segfault when used in conjunction with encoding (and works when encoding is not set).
>
> urlR <- "http://home.versanet.de/~s-berman/source2.R"
> # works
> url_default <- url(urlR)
> scan(url_default, "")
> # Read 7 items
> # [1] "source.test2"       "<-"                 "function()"         "{"                  "print(\"Non-ascii:" "äöüß\")"
> # [7] "}"
>
> url_default_en <- url(urlR, encoding = "UTF-8")
> scan(url_default_en, "")
> # Read 0 items
> # character(0)
> url_internal <- url(urlR, method = 'internal')
> scan(url_internal, "")
> # Read 7 items
> # [1] "source.test2"       "<-"                 "function()"         "{"                  "print(\"Non-ascii:" "äöüß\")"
> # [7] "}"
>
> url_internal_en <- url(urlR, encoding = "UTF-8", method = 'internal')
> #scan(url_internal_en, "")
> #*** caught segfault ***
> #  address 0x0, cause 'memory not mapped'
>
> url_libcurl <- url(urlR, method = 'libcurl')
> scan(url_libcurl, "")
> # Read 7 items
> # [1] "source.test2"       "<-"                 "function()"         "{"                  "print(\"Non-ascii:" "äöüß\")"
> # [7] "}"
> url_libcurl_en <- url(urlR, encoding = "UTF-8", method = 'libcurl')
> scan(url_libcurl_en, "")
> # Read 0 items
> # character(0)
>
>
> Michael
>
> ________________________________________
> From: R-devel [r-devel-bounces at r-project.org] on behalf of Stephen Berman [stephen.berman at gmx.net]
> Sent: Monday, 4 June 2018 7:26 PM
> To: Martin Maechler
> Cc: R-devel
> Subject: Re: [Rd] encoding argument of source() in 3.5.0
>
> On Mon, 4 Jun 2018 10:44:11 +0200 Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
>>>>>>> peter dalgaard
>>>>>>>      on Sun, 3 Jun 2018 23:51:24 +0200 writes:
>>      > Looks like this actually comes from readLines(), nothing
>>      > to do with source() as such: In current R-devel (still):
>>
>>      >> f <- file("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
>>      >> readLines(f)
>>      > character(0)
>>      >> close(f)
>>      >> f <- file("http://home.versanet.de/~s-berman/source2.R")
>>      >> readLines(f)
>>      > [1] "source.test2 <- function() {"   "    print(\"Non-ascii: äöüß\")"
>>      > [3] "}"
>>
>>      > -pd
>>
>> and that's not even readLines(), but rather how exactly the
>> connection is defined [even in your example above]
>>
>>    > urlR <- "http://home.versanet.de/~s-berman/source2.R"
>>    > readLines(urlR, encoding="UTF-8")
>>    [1] "source.test2 <- function() {"   "    print(\"Non-ascii: äöüß\")"
>>    [3] "}"
>>    > f <- file(urlR, encoding = "UTF-8")
>>    > readLines(f)
>>    character(0)
>>
>> and the same behavior with scan()  instead of readLines() :
>>
>>> scan(urlR,"") # works
>> Read 7 items
>> [1] "source.test2"       "<-"                 "function()"         "{"
>> [5] "print(\"Non-ascii:" "äöüß\")"            "}"
>>> scan(f,"") # fails
>> Read 0 items
>> character(0)
>> So it seems as if the bug is in the file() [or url()] C code ..
> Yes, the problem seems to be restricted to loading files from a
> (non-local) URL; i.e. this works fine on my computer:
>
>    > source("file:///home/steve/prog/R/source2.R", encoding="UTF-8")
>
> Also, I noticed this works too:
>
>    > read.table("http://home.versanet.de/~s-berman/table2", encoding="UTF-8", skip=1)
>
> where (if I read the source correctly) using `skip=1' makes read.table()
> call readLines().  (The read.table() invocation also works without
> `skip'.)
>
>> But then we also have to consider Windows .. where I think most changes have
>> happened during the  R-3.4.4 --> R-3.5.0  transition.
> Yes, please.  I need (or at least it would be convenient) to be able to
> load R code containing non-ascii characters from the web under
> MS-Windows.
>
> Steve Berman
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


