[Rd] download.file does not process gz files correctly (truncates them?)

Martin Maechler maechler at stat.math.ethz.ch
Fri May 4 10:18:29 CEST 2018


>>>>> Joris Meys <jorismeys at gmail.com>
>>>>>     on Fri, 4 May 2018 10:00:07 +0200 writes:

    > On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera
    > <tomas.kalibera at gmail.com> wrote:

    >> The current heuristic/hack is in line with the
    >> compatibility approach: it detects files that are
    >> obviously binary, so it changes the default behavior only
    >> for cases when it would obviously cause damage.
    >> 
    >> Tomas


    > Well, I was trying to download a .gz file and
    > download.file() didn't detect that. Reason for that is
    > obviously that the link doesn't contain .gz but %2Egz ,
    > using the ASCII code for the dot instead of the dot
    > itself. That's general practice in a lot of links.

    > Hence I propose to change the line in download.file() that
    > does this check to:

    >   if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
    >       URLdecode(url))))

    > using URLdecode() ensures that .gz, .RData etc will be
    > detected correctly in an encoded URL.

    > Cheers Joris

Makes sense to me, and I plan to add it when also adding '.rds'.
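
For illustration, the effect of the proposed check can be sketched
outside of download.file(). The helper name looks_binary() and the
example.org URLs are made up for this sketch and are not part of R;
the regular expression is the one from the proposal with 'rds' added.

```r
# Hypothetical helper (not part of R): TRUE if the URL, after
# percent-decoding, ends in one of the known binary extensions.
looks_binary <- function(url) {
  grepl("\\.(gz|bz2|xz|tgz|zip|rda|rds|RData)$", utils::URLdecode(url))
}

plain   <- "https://example.org/data/file.gz"    # made-up URL
encoded <- "https://example.org/data/file%2Egz"  # same, with "." as %2E

grepl("\\.gz$", encoded)   # FALSE: the raw URL contains no literal ".gz"
looks_binary(plain)        # TRUE
looks_binary(encoded)      # TRUE: "%2E" is decoded to "." first
looks_binary("https://example.org/index.html")   # FALSE
```

Note that the pattern is anchored at the end of the string, so a URL
with a trailing query string (e.g. "file.gz?download=1") would still
not be detected by this kind of check.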

{ OTOH, after reading the thread about this: Shouldn't you make
  your code more robust and use   mode = "wb" (or "ab") in any case?
  ;-)
}
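
A minimal sketch of that more robust approach, assuming a small
wrapper (fetch_binary() is a made-up name, not an R function) that
always passes mode = "wb" and so does not rely on the extension
heuristic at all. The check below uses a local file:// URL so that it
runs without network access; the way the file URL is built assumes a
Unix-alike path.

```r
# Hypothetical wrapper: always request binary mode explicitly.
fetch_binary <- function(url, destfile) {
  utils::download.file(url, destfile, mode = "wb", quiet = TRUE)
}

# Offline check: write a few bytes that text-mode transfer could
# mangle (CR, LF, Ctrl-Z, NUL), then "download" them from file://.
src <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x1f, 0x8b, 0x0d, 0x0a, 0x1a, 0x00)), src)

dest   <- tempfile()
status <- fetch_binary(paste0("file://", normalizePath(src, winslash = "/")),
                       dest)

# The copy should be byte-for-byte identical to the source.
same <- identical(readBin(dest, "raw", n = 100),
                  readBin(src,  "raw", n = 100))
```

With an explicit mode the result no longer depends on how the URL
happens to be spelled.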
 
Martin



