[R] string size limits in RCurl

Elmore, Ryan Ryan.Elmore at nrel.gov
Wed Apr 24 18:45:58 CEST 2013


Hi All,

I am running into what appears to be a character size limit on a JSON string when trying to retrieve data with either `curlPerform()` or `getURL()`.  Here is some non-reproducible code [1], but it should shed some light on the problem.

    library(RCurl)

    # Note that .base.url is the base URL for the API, q is a query, and
    #  user and key are the credentials, all defined elsewhere.
    session <- getCurlHandle()
    curl.opts <- list(userpwd = paste(user, ":", key, sep = ""),
                      httpheader = "Content-Type: application/json")
    request <- paste(.base.url, q, sep = "")
    txt <- getURL(url = request, curl = session, .opts = curl.opts,
                  write = basicTextGatherer())

or

    r <- dynCurlReader()
    curlPerform(url = request, writefunction = r$update, curl = session,
                .opts = curl.opts)

My guess is that the `update` or `value` functions in the `basicTextGatherer` or `dynCurlReader` text handler objects are having trouble with large strings.  In this example, `r$value()` returns a truncated string of approximately 2 MB.  The code given above works fine for responses smaller than about 2 MB.
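One way to check that guess would be to compare the number of bytes handed to the write callback with the size of the string that comes back out. The following is only a minimal sketch in plain R: `make_counting_gatherer` is a hypothetical helper, not part of RCurl, and the chunks are fed in by hand rather than by a real transfer.

```r
# Hypothetical helper (not an RCurl function): collects text chunks and
# keeps a running count of the bytes it has been given.
make_counting_gatherer <- function() {
  chunks <- character()
  n_bytes <- 0
  update <- function(chunk) {
    chunks <<- c(chunks, chunk)
    size <- nchar(chunk, type = "bytes")
    n_bytes <<- n_bytes + size
    size  # return the number of bytes consumed from this chunk
  }
  list(update   = update,
       value    = function() paste(chunks, collapse = ""),
       bytes    = function() n_bytes,
       n_chunks = function() length(chunks))
}

# Simulated transfer: three chunks fed by hand.
g <- make_counting_gatherer()
for (chunk in c('{"a": [1, 2', ', 3], "b": ', '"xyz"}')) g$update(chunk)
txt <- g$value()

# If these two numbers differ, text was lost while assembling the string.
c(received = g$bytes(), returned = nchar(txt, type = "bytes"))
```

In a real session, `g$update` could be passed as `writefunction` to `curlPerform()` in place of `r$update`. If `g$bytes()` itself stops near 2 MB, the truncation would be happening before the data reaches the R-level handler rather than inside it.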

Note that I can easily do the following from the command line (or using `system()` in R), but writing to disk seems like a waste if I am doing the subsequent analysis in R.

    curl -v --header "Content-Type: application/json" --user username:register:passwd https://base.url.for.api/getdata/select+*+from+sometable > stream.json

where `stream.json` is a roughly 14 MB JSON string. I can read the string into R using either

    con <- file(paste(.project.path, "data/stream.json", sep = ""), "r")
    string <- readLines(con)
    close(con)  # release the file connection once the text is read

or directly into a list with

    tmp <- fromJSON(file = paste(.project.path, "data/stream.json", sep = ""))

Any thoughts are very much appreciated.  Note that I posted this same question to StackOverflow and will happily pass along any helpful suggestions there as well.

Ryan

[1] - Sorry for not providing reproducible code, but I'm dealing with a govt firewall.


