[R] reading data from password protected url

Duncan Temple Lang duncan at wald.ucdavis.edu
Sun Jun 26 01:16:19 CEST 2011


Hi Steve

RCurl can help when you need more control over Web requests.
The details vary from Web site to Web site, as do the ways of specifying
passwords, etc.

If JSESSIONID and NCES_JSESSIONID are regular cookies and are returned as cookies in the
response to the first request, then you can just have RCurl handle the cookies for you.
The basics for your case are

  library(RCurl)
  # cookiefile = "" turns on libcurl's cookie handling without reading any file
  h = getCurlHandle(cookiefile = "")

Then make your Web requests using getURLContent(), getForm() or postForm(),
making certain to pass the curl handle stored in h in each call, e.g.

  ans = getForm(yourURL, login = "bob", password = "jane", curl = h)

  txt = getURLContent(dataURL, curl = h)


If JSESSIONID and NCES_JSESSIONID are returned not as cookies but as HTTP header fields, then you
need to process the header yourself.
Something like

  rdr = dynCurlReader(h)

  ans = getForm(yourURL, login = "bob", password = "jane", curl = h, header = rdr$update)

Then the header from the HTTP response is available as
  rdr$header()

and you can use parseHTTPHeader(rdr$header()) to convert it into a named vector.
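
As a rough sketch of what could come next (this goes beyond the steps above, so treat
it as an assumption about your server): if the session ids arrive as Set-Cookie fields,
you could pull them out of the parsed header and hand them back to the handle through
the curl cookie option. The header lines and cookie values below are made up so that the
snippet runs without a network connection; with a real response you would pass
rdr$header() to parseHTTPHeader() instead.

```r
library(RCurl)

# Made-up header lines standing in for rdr$header(); the values are hypothetical.
raw_header <- c("HTTP/1.1 200 OK",
                "Content-Type: text/html",
                "Set-Cookie: JSESSIONID=abc123; Path=/",
                "Set-Cookie: NCES_JSESSIONID=def456; Path=/")

# Named character vector; repeated fields keep the same name.
hdr <- parseHTTPHeader(raw_header)

# Keep the Set-Cookie fields and drop attributes such as Path.
set_cookie <- hdr[names(hdr) == "Set-Cookie"]
pairs <- sub(";.*$", "", set_cookie)

# Build a single "name=value; name=value" string for the cookie option.
cookie_string <- paste(pairs, collapse = "; ")

# Attach the cookies to a handle for use in later requests.
h <- getCurlHandle()
curlSetOpt(cookie = cookie_string, curl = h)

# txt <- getURLContent(dataURL, curl = h)   # dataURL is a placeholder
```

If the server uses different field names for the session ids, change the names(hdr)
test accordingly.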


 HTH,
    D.

On 6/24/11 2:12 PM, Steven R Corsi wrote:
> I am trying to retrieve data from a password protected database. I have login information and the proper url. When I
> make a request to the url, I get back some info, but need to read the "hidden header" information that has JSESSIONID
> and NCES_JSESSIONID. They need to be used to set cookies before sending off the actual url request that will result in
> the data transfer. Any help would be much appreciated.
> Thanks
> Steve
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
