[Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

Martin Maechler maechler at stat.math.ethz.ch
Thu Aug 27 17:16:53 CEST 2015


>>>>> "DM" == Duncan Murdoch <murdoch.duncan at gmail.com>
>>>>>     on Wed, 26 Aug 2015 19:07:23 -0400 writes:

    DM> On 26/08/2015 6:04 PM, Jeroen Ooms wrote:
    >> On Tue, Aug 25, 2015 at 10:33 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
    >>> 
    >>> actually I don't know that it does -- it addresses the symptom but I think there should be an error from libcurl on the 403 / 404 rather than from read.dcf on error page...
    >> 
    >> Indeed, the only correct behavior is to turn the protocol error code
    >> into an R exception. When the server returns a status code >= 400, it
    >> indicates that the request was unsuccessful and the response body does
    >> not contain the content the client had requested, but should instead
    >> be interpreted as an error message/page. Ignoring this fact and
    >> proceeding with parsing the body as usual is incorrect and leads to
    >> all kinds of strange errors downstream.
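
To illustrate the point quoted above, here is a minimal sketch in plain R of
turning a status code >= 400 into a classed R error before anyone tries to
parse the body. The helper name stop_for_http_status(), the condition classes,
the message wording and the example.org URL are all made up for illustration;
this is not what R's libcurl code does internally.

```r
# Hypothetical helper (not part of R): signal a classed error when an
# HTTP status code indicates failure (>= 400), otherwise return quietly.
stop_for_http_status <- function(status, url) {
  status <- as.integer(status)
  if (!is.na(status) && status >= 400L) {
    cond <- structure(
      class = c(paste0("http_", status), "http_error", "error", "condition"),
      list(
        message = sprintf("request for '%s' failed with HTTP status %d",
                          url, status),
        call = NULL,
        status = status,
        url = url
      )
    )
    stop(cond)
  }
  invisible(status)
}

# Usage: the caller can catch the error by class instead of parsing
# an HTML error page as if it were a PACKAGES file.
res <- tryCatch(
  stop_for_http_status(404L, "http://example.org/PACKAGES"),
  http_error = function(e) conditionMessage(e)
)
print(res)
```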

    DM> Yes.  I haven't been following this long thread.  Is it only in R-devel,
    DM> or is this happening in 3.2.2 or R-patched?

    DM> If the latter, please submit a bug report.  If it is only R-devel,
    DM> please just be patient.  When R-devel becomes R-alpha next year, if the
    DM> bug still exists, please report it.

    DM> Duncan Murdoch

Probably I'm confused now...
Both R-patched and R-devel give an error (after a *long* wait!) 
for
       download.file("https://someserver.com/mydata.csv", "mydata.csv")

So that problem is, I think, solved now.
Ideally, it would be nice to be able to set the *timeout* ourselves,
as an R function argument, though.
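
Until there is such an argument, one workaround is the global "timeout"
option (see ?options), which R uses for some Internet operations. The sketch
below sets it temporarily around a call; with_timeout() is a hypothetical
helper, and whether a given download method honours the option is an
assumption you would need to check for your R version.

```r
# Hypothetical wrapper: evaluate `expr` with options(timeout = seconds)
# in effect, restoring the previous value afterwards.
with_timeout <- function(seconds, expr) {
  old <- options(timeout = seconds)
  on.exit(options(old), add = TRUE)
  # `expr` is a promise, so it is only evaluated here,
  # after the option has been changed.
  force(expr)
}

# Usage (commented out, since it needs network access):
# with_timeout(10,
#   download.file("https://someserver.com/mydata.csv", "mydata.csv"))
```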

Kevin Ushey's original problem however is still in R-patched and
R-devel:

ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin", method="libcurl")
ap

giving

> ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin", method="libcurl")
Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin:
  Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
> ap
     Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs
     MD5sum NeedsCompilation File Repository
> 

and the resulting 'ap' is the same as, e.g., with the default
method, which also gives a warning and then an empty list (well,
a zero-row "matrix") of packages.


I don't see a big problem with the above.
It would be better if the warning did not contain the extra
   "Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!"
part, but apart from that I'd say the behavior is not bogus:

We ask for the available packages and get as answer 'zero packages',
which is correct.
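
A caller who would rather treat 'zero packages' as a failure can check the
result explicitly. The following is only a sketch: require_packages() is a
hypothetical helper, and the empty matrix merely stands in for the result
shown above so that the example runs without network access.

```r
# Hypothetical helper: stop if a package index has no rows.
require_packages <- function(ap, repos) {
  if (nrow(ap) == 0L)
    stop(sprintf("no packages found at '%s'", repos), call. = FALSE)
  invisible(ap)
}

# Stand-in for an empty available.packages() result
empty_ap <- matrix(character(0), nrow = 0, ncol = 2,
                   dimnames = list(NULL, c("Package", "Version")))

msg <- tryCatch(
  require_packages(empty_ap, "http://www.stats.ox.ac.uk/pub/RWin"),
  error = function(e) conditionMessage(e)
)
print(msg)
```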


