[R] help with json data from the web into data frame in R

David Winsemius dwinsemius using comcast.net
Tue May 8 21:49:25 CEST 2018


> On May 8, 2018, at 10:08 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
> 
> Hi David,  .. I think I've got it :-) 
> Please let me know if you see anything glaringly wrong with this:
> 
> library(RCurl)
> zWebObj <- postForm("https://www.semantic-mediawiki.org/w/api.php",
>   "action" = "ask",
>   "query" = "[[Category:City]]|?Capital%20of|?Has%20area",
>   "format" = "json"
>   .opts = list(ssl.verifypeer = FALSE)
> )
> 

The R interpreter tells me there's a missing comma after the line: 

"format" = "json"

Fixing that syntactic error I get:

str(zWebObj)
# --- begin console output ---
 atomic [1:1] {"query":{"printrequests":[{"label":"","key":"","redi":"","typeid":"_wpg","mode":2}],"results":{"File:2166320938 | __truncated__
 - attr(*, "Content-Type")= Named chr [1:2] "application/json" "utf-8"
  ..- attr(*, "names")= chr [1:2] "" "charset"
#--- end console output-----
js1 <-fromJSON(zWebObj)
#--------
Error: lexical error: inside a string, '\' occurs before a character which it may not.
          title":""}},"serializer":"SMW\Serializers\QueryResultSeriali
                     (right here) ------^


I'm not really a JSON expert, so am not equipped to offer debugging assistance there.
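
That said, a small offline check can at least confirm what the parser is objecting to. JSON only allows a backslash inside a string when it starts a recognised escape, so a lone backslash before "S" is rejected. The sketch below uses jsonlite and a made-up one-field string standing in for the real response; whether the server really sends unescaped backslashes, or something upstream altered them, is not established here. The regular-expression workaround is an assumption-laden patch, not a general repair.

```r
library(jsonlite)

# Stand-in for the failing fragment. In R source '\\' is ONE backslash, so this
# string holds: {"serializer":"SMW\Serializers"}
bad <- '{"serializer":"SMW\\Serializers"}'

# validate() reports whether the text is well-formed JSON without throwing.
is_valid_before <- isTRUE(as.logical(validate(bad)))

# ASSUMPTION: the text contains no correctly escaped backslashes. Any backslash
# not followed by a legal JSON escape character is doubled. Text that already
# contains "\\" pairs would be corrupted by this substitution.
fixed <- gsub('\\\\(?![\\\\/"bfnrtu])', '\\\\\\\\', bad, perl = TRUE)

is_valid_after <- isTRUE(as.logical(validate(fixed)))
parsed <- fromJSON(fixed)

print(c(before = is_valid_before, after = is_valid_after))
print(parsed$serializer)
```

If the check shows the raw text is invalid, the cleaner fix is on the producing side; patching the text before parsing is only a stopgap.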

-- 
David.
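
On the earlier point in this thread that the result is not tabular: one way to get a data frame is to pick the scalar fields you care about out of each entry and build the columns yourself. This is a minimal sketch on a hypothetical two-entry list shaped like the `results` element shown in the `str()` output further down (the entry names, URLs, and the helper `get_field` are invented for illustration); the nested `printouts` would need their own handling.

```r
# Hypothetical stand-in for myJSON$query$results: a named list with one entry
# per page, each holding scalar fields plus a (possibly empty) 'printouts' list.
results <- list(
  "Amsterdam" = list(printouts = list(), fulltext = "Amsterdam",
                     fullurl = "https://example.org/wiki/Amsterdam"),
  "Berlin"    = list(printouts = list(), fulltext = "Berlin",
                     fullurl = "https://example.org/wiki/Berlin")
)

# Pull one scalar field out of an entry; NA when it is absent or not length 1.
get_field <- function(entry, name) {
  value <- entry[[name]]
  if (is.null(value) || length(value) != 1L) NA_character_ else as.character(value)
}

# One column per field, one row per entry.
cities <- data.frame(
  fulltext = vapply(results, get_field, character(1), name = "fulltext",
                    USE.NAMES = FALSE),
  fullurl  = vapply(results, get_field, character(1), name = "fullurl",
                    USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

print(cities)
```

Calling `as.data.frame()` on the whole parsed object fails because the pieces have different lengths (including empty lists), which is what the "differing number of rows: 0, 1" error in the quoted message reports.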


> Thank you!
> -Rich
> 
> -----Original Message-----
> From: R-help [mailto:r-help-bounces using r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
> Sent: Tuesday, May 08, 2018 12:51 PM
> To: David Winsemius
> Cc: r-help using r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
> 
> [non-tabular json data] -- ok.. so I think I need to figure out how to make it tabular. Thanks!
> 
> [curl] -- I was hoping there was a cleaner way to do it. Using R to invoke cURL to get the data as text and then passing it into fromJSON seems to be what I need to do.
> 
> Do you by chance have a simple example of using RCurl to get a response while ignoring cert errors?
> 
> ty
> -Rich
> 
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius using comcast.net] 
> Sent: Tuesday, May 08, 2018 12:25 PM
> To: Evans, Richard K. (GRC-H000)
> Cc: r-help using r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
> 
> 
>> On May 8, 2018, at 9:03 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>> 
>> That said, I have two issues to ask for help with:
>> 
>> 1) how to ignore cert errors with a fromJSON call
> 
> If you can do it with curl, then why aren't you doing one of a) a system call, b) installing and loading RCurl, c) installing and loading curl (the R package with that name)?
> 
>> 
>> And 
>> 
>> 2) why the json data from the example link doesn't convert to a data frame.
> 
> That was already answered in my earlier response. It's not a tabular result, so it doesn't "fit" into a tabular structure.
> 
> -- 
> David.
> 
> 
>> As seen in the following example
>> 
>> library("rjson")
>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>> json_data_frame <- as.data.frame(result)
>> print(json_data_frame)
>> 
>> which results in:
>> 
>>> library("rjson")
>> 
>> Warning message:
>> package ‘rjson’ was built under R version 3.4.4 
>> 
>>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>>> json_data_frame <- as.data.frame(result)
>> 
>> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
>> arguments imply differing number of rows: 0, 1
>> 
>>> print(json_data_frame)
>> 
>> Error in print(json_data_frame) : object 'json_data_frame' not found
>> 
>>> 
>> 
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces using r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
>> Sent: Tuesday, May 08, 2018 11:52 AM
>> To: David Winsemius
>> Cc: r-help using r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>> 
>> Right. I'm trying to access a server within my organization which has a cert error that I cannot fix. 
>> 
>> The example link I provided was to a site on the web that does not have the cert error.
>> 
>> From the linux shell I use the "-k" switch with cURL to ignore cert errors.. is there an equivalent in the R world?
>> 
>> -Rich
>> 
>> 
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius using comcast.net] 
>> Sent: Tuesday, May 08, 2018 11:48 AM
>> To: Evans, Richard K. (GRC-H000)
>> Cc: Eric Berger; r-help using r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>> 
>> 
>>> On May 8, 2018, at 8:36 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>>> 
>>> I’ve been tinkering and discovered that the link I need to read json data from is ‘https’ and there is a certificate warning that I have to click through from a browser. That might be my issue. Is there any way in the json package to tell it to ignore self-signed cert errors in a url?
>> 
>> I didn't have that issue when using the link you offered:
>> 
>> library(jsonlite)
>> myJSON <- fromJSON( url("https://www.semantic-mediawiki.org/w/api.php?action=ask&query=%5B%5BCategory:City%5D%5D&format=json") )
>> 
>> # results in a complex list (not trivially reducible to a data frame):
>> 
>> str(myJSON)
>> List of 1
>> $ query:List of 5
>> ..$ printrequests:'data.frame':	1 obs. of  5 variables:
>> .. ..$ label : chr ""
>> .. ..$ key   : chr ""
>> .. ..$ redi  : chr ""
>> .. ..$ typeid: chr "_wpg"
>> .. ..$ mode  : int 2
>> ..$ results      :List of 39
>> .. ..$ File:2166320938 5cfc9ec72a z.jpg                      :List of 6
>> .. .. ..$ printouts   : list()
>> .. .. ..$ fulltext    : chr "File:2166320938 5cfc9ec72a z.jpg"
>> .. .. ..$ fullurl     : chr "https://www.semantic-mediawiki.org/wiki/File:2166320938_5cfc9ec72a_z.jpg"
>> #-----trimmed-----------
>> 
>> David
>> 
>>> 
>>> -Rich
>>> 
>>> From: Eric Berger [mailto:ericjberger using gmail.com]
>>> Sent: Tuesday, May 08, 2018 11:31 AM
>>> To: Evans, Richard K. (GRC-H000)
>>> Cc: r-help using r-project.org
>>> Subject: Re: [R] help with json data from the web into data frame in R
>>> 
>>> Hi Rich,
>>> Take a look at the function fromJSON found in the rjson package.
>>> Note that the Usage in the help page: ?fromJSON names the second 
>>> argument 'file' but if you look at the description the argument can be a URL.
>>> 
>>> HTH,
>>> Eric
>>> 
>>> 
>>> On Tue, May 8, 2018 at 6:16 PM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>>> Hello
>>> 
>>> I am able to construct a url that points to some data online in the JSON format.  See an example at [0].
>>> 
>>> I would like to work with this data as a dataframe in R.
>>> 
>>> I know that there is a package for handling JSON data [1], but it assumes the data is in a local file. It is not clear to me how to request the data from the web in an R script and convert the JSON data into a data frame in R.
>>> 
>>> Can anyone provide a basic example or some guidance please?
>>> 
>>> -Rich (revansx)
>>> 
>>> [0] https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]&format=json
>>> [1] https://www.tutorialspoint.com/r/r_json_files.htm
>>> 
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
>> 'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law
>> 
> 
