[Rd] download.file does not process gz files correctly (truncates them?)

Henrik Bengtsson henrik.bengtsson at gmail.com
Thu May 3 23:14:12 CEST 2018


Also, as mentioned in my earlier post,
https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
the mode argument is not specified, the default on Windows is mode = "w"
*except* for certain case-sensitive filename extensions:

    if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
        mode <- "wb"

Like the need for mode = "wb" itself, this special-file-extension hack
applies only on Windows, and it is documented in ?download.file only if
you are on Windows. So someone on Linux/macOS trying to help someone on
Windows may not be aware of it. This adds even more confusion, e.g.
"works for me".
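
To make the consequence concrete, here is a minimal sketch. Both helper
names are hypothetical and not part of R: the first one only mimics the
extension check quoted above, and the second one is one possible way to
sidestep the platform difference by always passing mode = "wb".

```r
# Hypothetical helper (not part of R): mimics the extension check above to
# show which mode a Windows user gets when 'mode' is not specified.
default_mode_on_windows <- function(url) {
    if (length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url))) "wb" else "w"
}

default_mode_on_windows("https://example.org/data.csv.gz")  # "wb"
default_mode_on_windows("https://example.org/data.csv.GZ")  # "w" (case-sensitive)
default_mode_on_windows("https://example.org/data.rds")     # "w" (not in the list)
default_mode_on_windows("https://example.org/data.gz?dl=1") # "w" (does not end in .gz)

# Hypothetical wrapper: always download as-is, regardless of platform
# and file extension.
download_binary <- function(url, destfile, ...) {
    utils::download.file(url, destfile, mode = "wb", ...)
}
```

With such a wrapper, a script written on Linux/macOS should behave the
same on Windows, since the bytes are written exactly as received.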

/Henrik

On Thu, May 3, 2018 at 7:27 AM, Joris Meys <jorismeys at gmail.com> wrote:
> Thank you Henrik and Martin for explaining what was going on. Very
> insightful!
>
> On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms <jeroenooms at gmail.com> wrote:
>>
>> On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson
>> <henrik.bengtsson at gmail.com> wrote:
>> > Use mode="wb" when you download the file. See
>> > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.
>> >
>> > R core, and others, is there a good argument for why we are not making
>> > this the default download mode? It seems like such a simple fix to such
>> > a common "mistake".
>>
>> I'd like to second this feature request. This default behaviour is
>> unexpected and often causes R scripts written on macOS/Linux to
>> produce corrupted files on Windows, checksum mismatches, etc.
>>
>> Even for text files, the default should be to download the file as-is.
>> Trying to "fix" line endings should be opt-in, never the default.
>> Downloading a file via a browser or FTP client on Windows doesn't
>> change the file either, so why should R?
>
>
> I third the feature request.
>
>>
>>
>>
>> On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch <murdoch.duncan at gmail.com>
>> wrote:
>> > Many downloads are text files (HTML, CSV, etc.), and if those are
>> > downloaded in binary, a Windows user might end up with a file that
>> > Notepad can't handle, because it would have Unix-style line endings.
>>
>> True but I don't think this is relevant. The same holds e.g. for the R
>> files in source packages, which also have unix line endings. Most
>> Windows users will use an actual editor that understands both types of
>> line endings, or can convert between the two.
>>
>> Downloading-file should do just that.
>
>
> Again, I agree. In my (limited) experience the only program that fails to
> properly display \n as a line ending, is Notepad. But it can still open the
> file regardless. If line ending conflicts cause bugs, it's almost always a
> unix-like OS struggling with Windows-style endings. I have yet to meet the
> first one the other way around.
>
> Cheers
> Joris
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)



