[Rd] Memory error in the libcurl connection code

Gábor Csárdi
Wed Jan 22 23:56:17 CET 2020


Hi All,

I think there is a memory error in the libcurl connection code that
typically happens when libcurl reads big chunks of data. This
potentially affects all code that uses url() with the libcurl download
method, which is the default in most builds. In practice it tends to
happen more often with HTTP/2 and when the connection is wrapped in a
gzcon(). macOS Catalina has a libcurl build with HTTP/2 support, so many
users who upgraded macOS are starting to see this.

The workaround is to avoid using url(), if you can. If you need an
HTTP stream, you can use curl::curl(), which is a drop-in replacement.

To reproduce, the easiest way is to use a libcurl build that has HTTP/2
support and a server that speaks HTTP/2 as well, e.g. the cloud mirror:

------------------------------------------------
~ # R --slave -e 'options(internet.info = 0); foo <-
readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))'
*   Trying 13.33.54.118:443...
* TCP_NODELAY set
* Connected to cran.rstudio.com (13.33.54.118) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=cran.rstudio.com
*  start date: Jul 24 00:00:00 2019 GMT
*  expire date: Aug 24 12:00:00 2020 GMT
*  subjectAltName: host "cran.rstudio.com" matched cert's "cran.rstudio.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56303c2910e0)
> GET /src/contrib/Meta/archive.rds HTTP/2
Host: cran.rstudio.com
User-Agent: R (3.4.4 x86_64-pc-linux-gnu x86_64 linux-gnu)
Accept: */*

* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200
< content-length: 2483432
< date: Wed, 22 Jan 2020 21:22:04 GMT
< server: Apache/2.4.39 (Unix)
< last-modified: Wed, 22 Jan 2020 17:10:22 GMT
< etag: "25e4e8-59cbd998a0360"
< accept-ranges: bytes
< cache-control: max-age=1800
< expires: Wed, 22 Jan 2020 21:52:04 GMT
< x-cache: Hit from cloudfront
< via: 1.1 6cbe48f9f9ff0c768f29d83804f75d4c.cloudfront.net (CloudFront)
< x-amz-cf-pop: MAN50-C1
< x-amz-cf-id: WwCQVQz9g8ZP6Az4m4n__h7aUW6vwlg0-AkiCv_DnVfGe10bzaFtfg==
< age: 960
<
* 85 data bytes written
Error in readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))
:
  reference index out of range
* stopped the pause stream!
* Connection #0 to host cran.rstudio.com left intact
Execution halted
------------------------------------------------

Sometimes you get a crash, sometimes a corrupt stream, etc. Sometimes
it actually works.

It seems that the fix is simply this:

------------------------------------
--- src/modules/internet/libcurl.c~
+++ src/modules/internet/libcurl.c
@@ -762,6 +762,7 @@
      void *newbuf = realloc(ctxt->buf, newbufsize);
      if (!newbuf) error("Failure in re-allocation in rcvData");
      ctxt->buf = newbuf; ctxt->bufsize = newbufsize;
+    ctxt->current = ctxt->buf;
  }

  memcpy(ctxt->buf + ctxt->filled, ptr, add);
------------------------------------
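
To illustrate why that one line matters: realloc() is allowed to move the
block, so any pointer into the old buffer (here ctxt->current) is left
dangling unless it is re-pointed at the new one. Below is a minimal,
self-contained sketch of the pattern, not the actual R source. The struct
conn_ctx and the functions ctx_init(), ctx_append() and ctx_free() are
hypothetical names; only the field names (buf, current, bufsize, filled)
are taken from the patch. It also assumes that unread data is moved to the
front of the buffer before new data is appended, and it simply doubles the
buffer when it needs to grow.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the connection context; only the field
   names are taken from the patch above. */
typedef struct {
    char  *buf;      /* start of the receive buffer */
    char  *current;  /* read position, always points into buf */
    size_t bufsize;  /* allocated size of buf */
    size_t filled;   /* number of unread bytes starting at current */
} conn_ctx;

/* Allocate the initial buffer. Returns 0 on success, -1 on failure. */
static int ctx_init(conn_ctx *c, size_t bufsize)
{
    c->buf = malloc(bufsize);
    if (!c->buf) return -1;
    c->current = c->buf;
    c->bufsize = bufsize;
    c->filled = 0;
    return 0;
}

/* Append `add` bytes from `ptr`, growing the buffer if needed.
   Returns 0 on success, -1 if the re-allocation fails. */
static int ctx_append(conn_ctx *c, const void *ptr, size_t add)
{
    /* Assumption: unread data is first moved to the front of the
       buffer (the regions may overlap, hence memmove). */
    if (c->filled) memmove(c->buf, c->current, c->filled);
    c->current = c->buf;

    if (c->filled + add > c->bufsize) {
        /* Assumption: grow by doubling until the new data fits. */
        size_t newbufsize = c->bufsize ? c->bufsize : 1;
        while (newbufsize < c->filled + add) newbufsize *= 2;

        char *newbuf = realloc(c->buf, newbufsize);
        if (!newbuf) return -1;
        c->buf = newbuf;
        c->bufsize = newbufsize;
        /* realloc() may have moved the block, so the old value of
           current can be dangling. Re-point it, as the patch does. */
        c->current = c->buf;
    }

    memcpy(c->buf + c->filled, ptr, add);
    c->filled += add;
    return 0;
}

/* Release the buffer and reset all fields. */
static void ctx_free(conn_ctx *c)
{
    free(c->buf);
    c->buf = c->current = NULL;
    c->bufsize = c->filled = 0;
}
```

Without the assignment after realloc(), a later read through current would
touch freed memory whenever the block had been moved, which would fit the
mix of crashes, corrupt streams and occasional successes described above.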

Best,
Gabor


