[Rd] Memory error in the libcurl connection code

Martin Maechler maechler at stat.math.ethz.ch
Thu Jan 23 09:49:36 CET 2020


>>>>> Gábor Csárdi 
>>>>>     on Wed, 22 Jan 2020 22:56:17 +0000 writes:

    > Hi All,
    > I think there is a memory error in the libcurl connection code that
    > typically happens when libcurl reads big chunks of data. This
    > potentially affects all code that uses url() with the libcurl download
    > method, which is the default in most builds. In practice it tends to
    > happen more with HTTP/2 and if the connection is wrapped into a
    > gzcon(). macOS Catalina has a libcurl build with HTTP/2 support, so many
    > users that upgraded macOS are starting to see this.

    > The workaround is to avoid using url(), if you can. If you need an
    > HTTP stream, you can use curl::curl(), which is a drop-in replacement.

    > To reproduce, the easiest is a libcurl build that has HTTP/2 support
    > and a server with HTTP/2 as well, e.g. the cloud mirror:

    > ------------------------------------------------
    > ~ # R --slave -e 'options(internet.info = 0); foo <-
    > readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))'
    > *   Trying 13.33.54.118:443...
    > * TCP_NODELAY set
    > * Connected to cran.rstudio.com (13.33.54.118) port 443 (#0)
    > * ALPN, offering h2
    > * ALPN, offering http/1.1
    > * successfully set certificate verify locations:
    > *   CAfile: /etc/ssl/certs/ca-certificates.crt
    > CApath: none
    > * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
    > * ALPN, server accepted to use h2
    > * Server certificate:
    > *  subject: CN=cran.rstudio.com
    > *  start date: Jul 24 00:00:00 2019 GMT
    > *  expire date: Aug 24 12:00:00 2020 GMT
    > *  subjectAltName: host "cran.rstudio.com" matched cert's "cran.rstudio.com"
    > *  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
    > *  SSL certificate verify ok.
    > * Using HTTP2, server supports multi-use
    > * Connection state changed (HTTP/2 confirmed)
    > * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
    > * Using Stream ID: 1 (easy handle 0x56303c2910e0)
    >> GET /src/contrib/Meta/archive.rds HTTP/2
    > Host: cran.rstudio.com
    > User-Agent: R (3.4.4 x86_64-pc-linux-gnu x86_64 linux-gnu)
    > Accept: */*

    > * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
    > < HTTP/2 200
    > < content-length: 2483432
    > < date: Wed, 22 Jan 2020 21:22:04 GMT
    > < server: Apache/2.4.39 (Unix)
    > < last-modified: Wed, 22 Jan 2020 17:10:22 GMT
    > < etag: "25e4e8-59cbd998a0360"
    > < accept-ranges: bytes
    > < cache-control: max-age=1800
    > < expires: Wed, 22 Jan 2020 21:52:04 GMT
    > < x-cache: Hit from cloudfront
    > < via: 1.1 6cbe48f9f9ff0c768f29d83804f75d4c.cloudfront.net (CloudFront)
    > < x-amz-cf-pop: MAN50-C1
    > < x-amz-cf-id: WwCQVQz9g8ZP6Az4m4n__h7aUW6vwlg0-AkiCv_DnVfGe10bzaFtfg==
    > < age: 960
    > <
    > * 85 data bytes written
    > Error in readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))
    > :
    > reference index out of range
    > * stopped the pause stream!
    > * Connection #0 to host cran.rstudio.com left intact
    > Execution halted
    > ------------------------------------------------

    > Sometimes you get a crash, sometimes a corrupt stream, etc. Sometimes
    > it actually works.

    > It seems that the fix is simply this:

    > ------------------------------------
    > --- src/modules/internet/libcurl.c~
    > +++ src/modules/internet/libcurl.c
    > @@ -762,6 +762,7 @@
    > void *newbuf = realloc(ctxt->buf, newbufsize);
    > if (!newbuf) error("Failure in re-allocation in rcvData");
    > ctxt->buf = newbuf; ctxt->bufsize = newbufsize;
    > +    ctxt->current = ctxt->buf;
    > }

    > memcpy(ctxt->buf + ctxt->filled, ptr, add);
    > ------------------------------------

    > Best,
    > Gabor

Thanks a lot, Gábor!

I can reproduce the problem (on Linux Fedora 30) and confirm
that your patch works.

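Even more, the patch looks "almost obvious", because

	ctxt->current = ctxt->buf

happens earlier in rcvData(), right after a change to the contents of
ctxt->buf, and so 'current' should be updated again whenever buf itself
changes: realloc() may move the block, which leaves the old pointer
dangling.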

An even slightly "better" patch just moves that statement down
to after the  if(add) { .. }  clause.
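
To illustrate the reasoning, here is a minimal standalone C sketch of the
same pattern. It is not R's actual source: the StreamBuf type and the sb_*
functions are hypothetical names made up for this example, and only the
field names (buf, current, filled, bufsize) mirror those in the patch. The
read pointer is reset once, after the buffer has possibly been
re-allocated, which corresponds to the "moved down" variant described
above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a connection's receive buffer (not R's struct). */
typedef struct {
    char  *buf;      /* start of the heap allocation */
    char  *current;  /* next unread byte; must always point into buf */
    size_t filled;   /* number of unread bytes starting at current */
    size_t bufsize;  /* allocated size of buf */
} StreamBuf;

/* Allocate an empty buffer; returns 0 on success, -1 on failure. */
int sb_init(StreamBuf *sb, size_t size)
{
    if (size == 0) return -1;
    sb->buf = malloc(size);
    if (!sb->buf) return -1;
    sb->current = sb->buf;
    sb->filled = 0;
    sb->bufsize = size;
    return 0;
}

/* Append n incoming bytes, growing the buffer when they do not fit. */
int sb_receive(StreamBuf *sb, const void *data, size_t n)
{
    /* Move any unread data to the front; the regions can overlap. */
    if (sb->filled) memmove(sb->buf, sb->current, sb->filled);

    if (n) {
        if (sb->filled + n > sb->bufsize) {
            size_t newsize = sb->bufsize;
            while (newsize < sb->filled + n) newsize *= 2;
            char *newbuf = realloc(sb->buf, newsize);
            if (!newbuf) {
                sb->current = sb->buf;  /* old block is still valid */
                return -1;
            }
            sb->buf = newbuf;           /* the block may have moved */
            sb->bufsize = newsize;
        }
        memcpy(sb->buf + sb->filled, data, n);
        sb->filled += n;
    }

    /* Reset the read pointer only now, when buf has its final value.
       Doing this before realloc() would leave it pointing at freed memory. */
    sb->current = sb->buf;
    return 0;
}

/* Consume up to n bytes into out; returns the number of bytes copied. */
size_t sb_read(StreamBuf *sb, void *out, size_t n)
{
    /* Invariant: the unread region lies entirely inside the allocation. */
    assert(sb->current >= sb->buf &&
           sb->current + sb->filled <= sb->buf + sb->bufsize);
    if (n > sb->filled) n = sb->filled;
    memcpy(out, sb->current, n);
    sb->current += n;
    sb->filled -= n;
    return n;
}

/* Release the buffer and clear all fields. */
void sb_free(StreamBuf *sb)
{
    free(sb->buf);
    sb->buf = sb->current = NULL;
    sb->filled = sb->bufsize = 0;
}
```

In this sketch, a partial sb_read() followed by an sb_receive() large
enough to force a re-allocation is the case that goes wrong if 'current'
is set before realloc() instead of after it.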

I'll patch the sources, and will port to 'R 3.6.2 patched'.

Martin


