[Rd] download.file does not process gz files correctly (truncates them?)

Joris Meys
Mon May 7 10:49:00 CEST 2018


Martin, also from me a heartfelt thank you for taking care of this. Some
thoughts on Henrik's response:

On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:

>
> I still argue that the current behavior causes more harm than good.
>

I agree with your analysis of the problems this legacy behaviour causes.

> Deprecating the default mode="w" on Windows can be done in steps, e.g.
> by making the argument mandatory for a while. This could be done on
> all platforms because we're already all affected, i.e. we need to
> specify 'mode' to avoid surprises.
>

That sounds like a reasonable way to move away from this discrepancy
between operating systems.
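
To make the "mandatory for a while" step concrete, here is a rough sketch of
what it could look like from the user's side. The wrapper name
download_file_strict is hypothetical and only for illustration; it is not a
proposal for the actual patch to utils::download.file().

```r
# Hypothetical wrapper (not part of R): refuses to guess the transfer mode.
download_file_strict <- function(url, destfile, mode, ...) {
  if (missing(mode)) {
    # During a transition period this could be a warning instead of an error.
    stop("argument 'mode' must be specified explicitly, e.g. mode = \"wb\"")
  }
  utils::download.file(url, destfile, mode = mode, ...)
}
```

Calling it without 'mode' fails before any download is attempted, on every
platform alike.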


> What about case-insensitive matching, e.g. data.ZIP and data.Rdata?
>

Totally agree, and easily solved by, e.g., adding ignore.case = TRUE to the
grep() call.
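
A small illustration of the difference. The pattern and the URLs below are
made up for the example and are not copied from the R sources:

```r
# Illustrative extension list only; an assumption, not R's actual pattern.
binary_pattern <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"

urls <- c("http://example.org/data.ZIP",
          "http://example.org/data.Rdata",
          "http://example.org/notes.txt")

# Case-sensitive matching misses both "ZIP" and "Rdata".
sensitive <- grepl(binary_pattern, urls)

# Case-insensitive matching catches them and still skips the .txt file.
insensitive <- grepl(binary_pattern, urls, ignore.case = TRUE)
```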


> A quick scan of the R source code suggests that R is also working with
> the following filename extensions (using various case styles):
>
> What about all the other file extensions that we know for sure are binary?
>

If the default isn't changed, doesn't it make more sense to turn the logic
around? Text files downloaded over the internet almost always end in .txt,
.csv, or one of a few other extensions used for text data. Those are also
the only files where people using very old Windows text-processing programs
can get into trouble. So instead of listing every possible binary
extension, one could make "wb" the default and switch to "w" only when the
file is recognised as text, rather than the other way around. That would
not change the concept of the behaviour, but it would ensure that the
function never fails to detect a binary file. Failing to detect a text file
is far less of a problem, because leaving the line endings unconverted
does not corrupt the file.
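
Roughly what I have in mind, as a sketch only. The helper name guess_mode
and the list of text extensions are assumptions for the sake of the example,
not what download.file() currently does:

```r
# Hypothetical helper: binary is the default, text is the exception.
guess_mode <- function(url, text_ext = c("txt", "csv", "tsv")) {
  pattern <- paste0("\\.(", paste(text_ext, collapse = "|"), ")$")
  # Only a recognised text extension switches to text mode; anything
  # unknown stays binary, so it is never altered during the download.
  if (grepl(pattern, url, ignore.case = TRUE)) "w" else "wb"
}

mode_zip     <- guess_mode("http://example.org/data.ZIP")
mode_txt     <- guess_mode("http://example.org/notes.TXT")
mode_unknown <- guess_mode("http://example.org/archive.unknown")
```

With this, an extension nobody thought of ends up as "wb", which is the safe
side to err on.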

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

