[R] html into R

Rui Barradas ruipbarradas using sapo.pt
Fri Aug 26 15:26:10 CEST 2022


Hello,

You are right, I haven't assigned the return value.
Start the pipe with something like

RiverTweed <- page |>
   rest_of_pipe
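
As a minimal sketch of that assignment: the tibble is whatever the pipe returns, while `y` exists only inside the anonymous function, which is why R reports it as not found afterwards. The inline HTML below is a hypothetical stand-in for the downloaded page (three extra rows, then the row with the column names, then data), built with `minimal_html()` so that it runs without a network connection. `use.names = FALSE` is an addition here so that the header is a plain character vector.

```r
library(rvest)

# Hypothetical stand-in for the real page: three rows of metadata,
# then the row holding the real column names, then the data rows.
page <- minimal_html('
  <table>
    <tr><td>info1</td><td>a</td><td>b</td></tr>
    <tr><td>info2</td><td>c</td><td>d</td></tr>
    <tr><td>info3</td><td>e</td><td>f</td></tr>
    <tr><td>Timestamp</td><td>Value</td><td>Quality Code</td></tr>
    <tr><td>2020-01-01T00:00</td><td>1.5</td><td>50</td></tr>
    <tr><td>2020-01-01T00:15</td><td>1.6</td><td>50</td></tr>
  </table>
')

# Assign the value of the whole pipe; the result is the tibble.
RiverTweed <- page |>
  html_element("table") |>
  html_table(header = TRUE) |>
  (\(x) {
    hdr <- unlist(x[3, ], use.names = FALSE)  # row 3 holds the column names
    y <- x[-(1:3), ]                          # y is local to this function
    names(y) <- hdr
    y
  })()

names(RiverTweed)
```

After this, `RiverTweed` can be used like any other tibble, for instance `RiverTweed$Value`.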


If you have more files to download and process, post an example of 2 or 
3 links and I'll see if it can be automated.
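
In the meantime, here is a rough sketch of how the links could be built with paste0, assuming that only the station number in ts_path and the from/to dates change between requests. That assumption is not confirmed by the thread. `make_link` is a hypothetical helper name and the station id "99999" is made up for illustration; only "14972" comes from the link in this thread.

```r
# Fixed part of the query, copied from the link in this thread.
base_url <- paste0(
  "https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters",
  "&type=queryServices&datasource=0&request=getTimeseriesValues"
)

# Hypothetical helper: build one link per station id and date range.
# paste0 is vectorised, so a vector of stations gives a vector of links.
make_link <- function(station, from, to) {
  paste0(
    base_url,
    "&ts_path=1/", station, "/Q/15m.Cmd",
    "&from=", from, "&to=", to,
    "&returnfields=Timestamp,Value,Quality%20Code"
  )
}

stations <- c("14972", "99999")  # "99999" is a made-up id
links <- make_link(stations, "2020-01-01", "2020-01-07")
links[1]
```

Each element of `links` could then be passed to read_html() and run through the same pipe, for example inside lapply.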

Also posting to R-help.


Hope this helps,

Rui Barradas



At 14:18 on 26/08/2022, Nick Wray wrote:
> Hi Rui, sorry to have to ask again, but although your code prints out a
> tibble, I can't identify it, i.e. find its name. I assumed
> that it's "y", but outside of your code R tells me that y is not found.
> I've tried various things, but nothing gives me the tibble as an object with
> a name that I can use. Thanks, Nick
> 
> On Fri, 26 Aug 2022 at 13:37, Nick Wray <nickmwray using gmail.com> wrote:
> 
>> Hi Rui, that is brilliant, thanks very much. What is even better is that I
>> have loads of data from different years, rivers and stations to download,
>> each of which entails a different set of numerical inputs. I was thinking
>> about how I could loop through the URL with different inputs, but
>> by using paste I can create all the links I need. Thanks again, Nick
>>
>> On Fri, 26 Aug 2022 at 11:57, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>>> Sorry, there's simpler code. I used html_elements (plural) and the
>>> result is a list. Use html_element (singular) and the output is a tibble.
>>>
>>>
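>>> page |>
>>>     html_element("table") |>      # first <table> node only
>>>     html_table(header = TRUE) |>  # parse it into a tibble
>>>     (\(x) {
>>>       hdr <- unlist(x[3, ])       # row 3 holds the real column names
>>>       y <- x[-(1:3), ]            # drop the 3 rows above the data
>>>       names(y) <- hdr
>>>       y
>>>     })()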
>>>
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> At 11:53 on 26/08/2022, Rui Barradas wrote:
>>>> Hello,
>>>>
>>>> You can try the following. It worked for me.
>>>> Read from the link and post-process the HTML data, extracting the
>>>> element "table" and then the table itself.
>>>>
>>>> This table has 3 rows before the actual data, so the lapply below
>>>> takes the third row as the header and drops those 3 rows.
>>>>
>>>>
>>>> library(httr)
>>>> library(rvest)
>>>>
>>>>
>>>> link <-
>>>>   "https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code"
>>>>
>>>>
>>>> page <- read_html(link)
>>>> page |>
>>>>     html_elements("table") |>
>>>>     html_table(header = TRUE) |>
>>>>     lapply(\(x) {
>>>>       hdr <- unlist(x[3, ])
>>>>       y <- x[-(1:3), ]
>>>>       names(y) <- hdr
>>>>       y
>>>>     })
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>> At 09:43 on 26/08/2022, Nick Wray wrote:
>>>>> Hello, I need to download flow data for Scottish river catchments.
>>>>> The data is available from the Scottish Environment Protection Agency
>>>>> (SEPA), and that doesn't present a problem. For example, the API call
>>>>> below will access the 96 flow recordings on the River Tweed on
>>>>> Jan 1st 2020 at one station:
>>>>>
>>>>>
>>>>> https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code
>>>>>
>>>>>
>>>>>
>>>>> But this data comes as HTML. I can copy and paste it into a text doc
>>>>> which can then be read into R, but that's slow and time-consuming. I have
>>>>> tried using the package "rvest" to import the HTML into R but I have got
>>>>> nowhere.
>>>>>
>>>>> Can anyone give me any pointers as to how to do this?
>>>>>
>>>>>
>>>>> Thanks Nick Wray
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>


