[R] google login via RCurl

Hiroyuki Kawakatsu hkawakat at gmail.com
Thu Feb 9 07:24:47 CET 2012


Hi,

Has anyone managed to log in to a Google account via RCurl? All the
information I can find on the web appears to be out of date.

(1) Both RGoogleDocs and RGoogleTrends on omegahat appear to have been withdrawn:
http://www.omegahat.org/RGoogleDocs/
http://www.omegahat.org/RGoogleTrends/
Does anyone know why?

(2) The closest I can get is based on code from
http://www.stanford.edu/~knoepfle/cgi-bin/flatpress/?x=entry:entry101220-023915

When I try this, I get back a page saying "Your browser's cookie
functionality is turned off. Please turn it on" (on both Windows and
Linux). I find that the cookie file specified in
curlSetOpt(cookiefile=) does not always get created, but
curlSetOpt(verbose=TRUE) shows lines such as

* Added cookie GAPS=xxx for domain www.google.com, path /accounts, expire 1391784478
< Set-Cookie: GAPS=xxx;Path=/accounts;Expires=Fri, 07-Feb-2014 14:47:58 GMT;Secure;HttpOnly

I am behind a firewall but the verbose trace seems to indicate that
curl is going through the proxy properly.
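
For reference, the cookie setup described in (2) can be sketched roughly
as below. This is only a sketch under stated assumptions:
new_cookie_handle is a hypothetical helper name and login_url a
placeholder, neither taken from the page linked in (2). In libcurl,
cookiefile names a file to read cookies from (and switches on cookie
handling), while cookiejar names the file cookies are written to, and
that file is only written when the handle is cleaned up. That may be
why the cookie file does not always appear, though I have not confirmed
this for RCurl.

```r
# Hypothetical helper (not from the linked post): build a curl handle that
# keeps cookies in memory and also saves them to 'jar' on cleanup.
new_cookie_handle <- function(jar = tempfile(fileext = ".txt")) {
  h <- RCurl::getCurlHandle()
  RCurl::curlSetOpt(
    cookiefile     = jar,   # read cookies from here; also enables cookie handling
    cookiejar      = jar,   # write cookies here when the handle is cleaned up
    followlocation = TRUE,  # follow redirects
    verbose        = TRUE,  # print the request/response trace
    curl           = h
  )
  h
}

# Usage (needs the RCurl package; login_url is a placeholder):
# h <- new_cookie_handle("cookies.txt")
# page <- RCurl::getURL(login_url, curl = h)
# rm(h); gc()   # assumption: garbage-collecting the handle flushes the jar
```

The same handle would have to be reused for every request in the login
sequence, otherwise cookies set by one request are not sent with the next.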

(3) A similar problem is asked about and solved with PHP code at
http://stackoverflow.com/questions/8991873/login-to-google-with-php-and-curl-cookie-turned-off
As far as I (with no experience in PHP) can tell, the R code in (2)
does the same thing as the accepted answer in (3). Can anyone see the
difference? One difference is that (2) only grabs the input field GALX,
whereas (3) seems to grab and post all fields. Even when I grep for and
post all input fields, the result is the same "cookie functionality is
turned off" page.
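
To make "grep and post all input fields" concrete, here is a minimal
base-R sketch of that step. extract_inputs is a hypothetical helper, the
sample HTML is made up for illustration (only the field name GALX comes
from the real form), and regex matching of HTML is fragile, so treat
this as an illustration rather than a robust parser.

```r
# Hypothetical helper: collect name/value pairs from <input> tags.
extract_inputs <- function(html) {
  # All <input ...> tags in the page, case-insensitively.
  tags <- regmatches(html, gregexpr("<input[^>]*>", html, ignore.case = TRUE))[[1]]

  # Value of one quoted attribute inside a single tag, or NA if absent.
  attr_value <- function(tag, key) {
    pat <- paste0("[[:space:]]", key, "[[:space:]]*=[[:space:]]*[\"']([^\"']*)[\"']")
    m <- regmatches(tag, regexec(pat, tag, ignore.case = TRUE))[[1]]
    if (length(m) == 2) m[2] else NA_character_
  }

  nms  <- vapply(tags, attr_value, character(1), key = "name",  USE.NAMES = FALSE)
  vals <- vapply(tags, attr_value, character(1), key = "value", USE.NAMES = FALSE)
  vals[is.na(vals)] <- ""          # inputs without a value post as empty
  keep <- !is.na(nms)              # inputs without a name are not posted
  setNames(vals[keep], nms[keep])
}

# Made-up sample page for illustration only.
html <- paste0('<form><input type="hidden" name="GALX" value="abc123">',
               '<input type="text" name="Email"></form>')
fields <- extract_inputs(html)
```

The resulting named vector could then be filled in (for example
fields[["Email"]]) and passed to RCurl's postForm() through its .params
argument, using the same curl handle that fetched the login page.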

(4) My objective is to scrape the search index data as a .csv file from
http://www.google.com/insights/search/
as in (2). (I can download the data manually in a web browser.) If
there are command-line tools other than RCurl that can mimic a browser
well enough to download these data, I would like to know.

Thanks for any help/hints,
h.
-- 
+---
| Hiroyuki Kawakatsu
| Business School, Dublin City University
| Dublin 9, Ireland. Tel +353 (0)1 700 7496


