[Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

Kevin Ushey kevinushey at gmail.com
Tue Aug 25 23:41:28 CEST 2015


In fact, this does reproduce on R-devel:

    > options(download.file.method = "libcurl")
    > options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
    + "http://www.stats.ox.ac.uk/pub/RWin"))
    > install.packages("lattice") ## could be any package
    Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
    (as ‘lib’ is unspecified)
    Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!

    > sessionInfo()
    R Under development (unstable) (2015-08-14 r69078)
    Platform: x86_64-apple-darwin13.4.0 (64-bit)
    Running under: OS X 10.10.4 (Yosemite)

I think this could be problematic for users with custom CRAN
repositories. For example, if I have a CRAN repository that only
serves source packages (no binary packages), this implies that any R
session configured to download binary packages would fail to download
any packages at all (as it would barf on attempting to read the
non-existent PACKAGES file for the 'binary' branch of the custom
repository).
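
One way to keep a single unreachable repository from aborting the whole
lookup is to query each repository on its own and skip the ones that error
out. The following is only a minimal sketch under stated assumptions:
`safe_available()` and its `fetch` argument are hypothetical names introduced
here for illustration (they are not part of R), and the sketch assumes that a
per-repository failure surfaces as an ordinary R error, as in the transcript
above.

```r
# Hypothetical helper (not part of R): query every repository separately so
# that one bad index cannot abort the lookup for all the others.
# `fetch` is an assumed hook taking a repository URL and returning a package
# matrix; the default delegates to utils::available.packages().
safe_available <- function(repos,
                           fetch = function(url) {
                             utils::available.packages(
                               contriburl = utils::contrib.url(url)
                             )
                           }) {
  indexes <- lapply(unname(repos), function(url) {
    tryCatch(
      fetch(url),
      error = function(e) {
        # Downgrade the failure to a warning that names the repository.
        warning(sprintf("skipping repository '%s': %s",
                        url, conditionMessage(e)),
                call. = FALSE)
        NULL
      }
    )
  })
  # Stack the indexes that were read; yields NULL if every repository failed.
  do.call(rbind, Filter(Negate(is.null), indexes))
}
```

It could be called as `safe_available(getOption("repos"))`. Passing the
fetcher in as an argument is just a convenience so the logic can be exercised
without network access.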

This can also be seen by attempting to install a package using current
R-devel (since no binaries are made available for R 3.3):

    > options(download.file.method = "libcurl")
    > options(repos = c(CRAN = "https://cran.rstudio.com/"))
    > print(getOption("pkgType"))
    [1] "both"
    > install.packages("lattice")
    Installing package into ‘/Users/kevinushey/Library/R/3.3/library’
    (as ‘lib’ is unspecified)
    Error in install.packages : Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!

The same error (with an XML response rather than an HTML one) is returned
when using e.g. `https://cran.fhcrc.org`.
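
Since in both cases the parser is being handed an HTML or XML error page
instead of a package index, a cheap sanity check on the downloaded file can
catch this before the file is parsed. This is a rough sketch, not how R itself
handles it: `looks_like_markup()` is a hypothetical helper, and the heuristic
(first non-blank line starts with `<`) is an assumption that relies on a valid
PACKAGES file starting with a `Field: value` line.

```r
# Hypothetical check (not part of R): does the file look like an HTML/XML
# error page rather than a DCF index of "Field: value" records?
looks_like_markup <- function(path) {
  # Only the first few lines are needed to make the call.
  lines <- readLines(path, n = 20L, warn = FALSE)
  lines <- lines[nzchar(trimws(lines))]
  # Assumed heuristic: markup starts with '<', a DCF record does not.
  length(lines) > 0L && grepl("^\\s*<", lines[1L])
}

# Usage: an HTML error page versus a DCF-style index.
html <- tempfile()
writeLines(c("<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">",
             "<html><head><title>403 Forbidden</title></head></html>"),
           html)
dcf <- tempfile()
writeLines(c("Package: lattice", "Version: 0.20-33"), dcf)

looks_like_markup(html)
looks_like_markup(dcf)
```

A check like this only treats the symptom; it does not replace having the
download itself fail on a 403 or 404 status.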

Kevin

On Tue, Aug 25, 2015 at 1:33 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> On 08/25/2015 01:30 PM, Kevin Ushey wrote:
>>
>> Hi Martin,
>>
>> Indeed it does (and I should have confirmed myself with R-patched and
>> R-devel before posting...)
>
>
> actually I don't know that it does -- it addresses the symptom but I think
> there should be an error from libcurl on the 403 / 404 rather than from
> read.dcf on the error page...
>
> Martin
>
>
>>
>> Thanks, and sorry for the noise.
>> Kevin
>>
>>
>> On Tue, Aug 25, 2015, 13:11 Martin Morgan <mtmorgan at fredhutch.org> wrote:
>>
>>     On 08/25/2015 12:54 PM, Kevin Ushey wrote:
>>      > Hi all,
>>      >
>>      > The following fails for me (on OS X, although I imagine it's the same
>>      > on other platforms using libcurl):
>>      >
>>      >      options(download.file.method = "libcurl")
>>      >      options(repos = c(CRAN = "https://cran.rstudio.com/",
>>      >                        CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
>>      >      install.packages("lattice") ## could be any package
>>      >
>>      > gives me:
>>      >
>>      >      > options(download.file.method = "libcurl")
>>      >      > options(repos = c(CRAN = "https://cran.rstudio.com/",
>>      >      +                   CRANextra = "http://www.stats.ox.ac.uk/pub/RWin"))
>>      >      > install.packages("lattice") ## could be any package
>>      >      Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
>>      >      (as ‘lib’ is unspecified)
>>      >      Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>>      >
>>      > This seems to come from a call to `available.packages()` to a URL that
>>      > doesn't exist on the server (likely when querying PACKAGES on the
>>      > CRANextra repo)
>>      >
>>      > E.g.
>>      >
>>      >      > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
>>      >      > available.packages(URL, method = "internal")
>>      >      Warning: unable to access index for repository
>>      >      http://www.stats.ox.ac.uk/pub/RWin
>>      >           Package Version Priority Depends Imports LinkingTo Suggests
>>      >           Enhances License License_is_FOSS License_restricts_use OS_type
>>      >           Archs MD5sum NeedsCompilation File Repository
>>      >      > available.packages(URL, method = "libcurl")
>>      >      Error: Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>>      >
>>      > It looks like libcurl downloads and retrieves the 403 page itself,
>>      > rather than reporting that it was actually forbidden, e.g.:
>>      >
>>      >      > download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz",
>>      >      +               tempfile(), method = "libcurl")
>>      >      trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz'
>>      >      Content type 'text/html; charset=iso-8859-1' length 339 bytes
>>      >      ==================================================
>>      >      downloaded 339 bytes
>>      >
>>      > Using `method = "internal"` gives an error related to the inability to
>>      > access that URL due to the HTTP status 403.
>>      >
>>      > The overarching issue here is that package installation shouldn't fail
>>      > even if libcurl fails to access one of the repositories set.
>>      >
>>
>>     With
>>
>>       > R.version.string
>>     [1] "R version 3.2.2 Patched (2015-08-25 r69179)"
>>
>>     the behavior is to warn with an indication of the repository for which the
>>     problem occurs
>>
>>       > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
>>       > available.packages(URL, method="libcurl")
>>     Warning: unable to access index for repository
>>     http://www.stats.ox.ac.uk/pub/RWin:
>>         Line starting '<!DOCTYPE HTML PUBLI ...' is malformed!
>>            Package Version Priority Depends Imports LinkingTo Suggests Enhances
>>            License License_is_FOSS License_restricts_use OS_type Archs MD5sum
>>            NeedsCompilation File Repository
>>       > available.packages(URL, method="internal")
>>     Warning: unable to access index for repository
>>     http://www.stats.ox.ac.uk/pub/RWin:
>>         cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES'
>>            Package Version Priority Depends Imports LinkingTo Suggests Enhances
>>            License License_is_FOSS License_restricts_use OS_type Archs MD5sum
>>            NeedsCompilation File Repository
>>
>>     Does that work for you / address the problem?
>>
>>     Martin
>>
>>      >> sessionInfo()
>>      > R version 3.2.2 (2015-08-14)
>>      > Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>      > Running under: OS X 10.10.4 (Yosemite)
>>      >
>>      > locale:
>>      > [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>>      >
>>      > attached base packages:
>>      > [1] stats     graphics  grDevices utils     datasets  methods   base
>>      >
>>      > other attached packages:
>>      > [1] testthat_0.8.1.0.99  knitr_1.11           devtools_1.5.0.9001
>>      > [4] BiocInstaller_1.15.5
>>      >
>>      > loaded via a namespace (and not attached):
>>      >   [1] httr_1.0.0     R6_2.0.0.9000  tools_3.2.2    parallel_3.2.2 whisker_0.3-2
>>      >   [6] RCurl_1.95-4.1 memoise_0.2.1  stringr_0.6.2  digest_0.6.4   evaluate_0.7.2
>>      >
>>      > Thanks,
>>      > Kevin
>>      >
>>      >
>>
>>
>>     --
>>     Computational Biology / Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N.
>>     PO Box 19024 Seattle, WA 98109
>>
>>     Location: Arnold Building M1 B861
>>     Phone: (206) 667-2793
>>


