[R] reading text files directly into program from net

Ivan Krylov kry|ov@r00t @end|ng |rom gm@||@com
Sun Sep 25 11:08:41 CEST 2022


On Sun, 25 Sep 2022 09:53:39 +0100
Nick Wray <nickmwray using gmail.com> wrote:

> The first station in this dataset has the name  00265_mertoun
> <https://data.ceda.ac.uk/badc/ukmo-midas-open/data/uk-daily-rain-obs/dataset-version-201901/berwickshire/00265_mertoun>
> which is a code and location name, again for example, and inside is a
> text file

Following the link to the file
<https://dap.ceda.ac.uk/badc/ukmo-midas-open/data/uk-daily-rain-obs/dataset-version-201901/berwickshire/00265_mertoun/midas-open_uk-daily-rain-obs_dv-201901_00265_mertoun_capability.csv?download=1>,
I get a login prompt. The same thing probably happens when R tries to
download those files.

Does the CEDA Archive have an API for programmatic access? If not, you'll
either have to export the cookies from your browser and use the curl
package to send HTTP requests with those included, or use the developer
toolbar in your browser to find out how the login request is sent and
use the curl package to (1) send the login request, (2) receive cookies
and (3) use those cookies to download files. This is called "website
scraping" and may be brittle, depending on how much the website
administrators dislike bots.
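
A rough sketch of both variants with the curl package follows. I have
not run it against CEDA. The cookie string, the login URL and the form
field names ("username", "password") are placeholders, so copy the real
ones from the developer toolbar. A real login form may also require
extra hidden fields (a CSRF token, for example), which this does not
handle.

```r
# Variant A: reuse a session cookie exported from the browser.
# 'cookie_string' looks like "name=value; other=value" (placeholder).
download_with_cookie <- function(url, destfile, cookie_string) {
  h <- curl::new_handle()
  curl::handle_setopt(h, cookie = cookie_string)
  curl::curl_download(url, destfile, handle = h)
}

# Variant B: log in first, then download with the same handle.
# 'login_url' and the form field names are hypothetical.
download_after_login <- function(login_url, user, password, url, destfile) {
  h <- curl::new_handle()

  # (1) send the login request as a URL-encoded POST form
  body <- paste0(
    "username=", utils::URLencode(user, reserved = TRUE),
    "&password=", utils::URLencode(password, reserved = TRUE)
  )
  curl::handle_setopt(h, copypostfields = body)
  resp <- curl::curl_fetch_memory(login_url, handle = h)
  if (resp$status_code >= 400) {
    stop("login failed: HTTP ", resp$status_code)
  }

  # (2) the handle now holds whatever cookies the server set;
  #     curl::handle_cookies(h) lists them

  # (3) switch the handle back to GET and download with those cookies
  curl::handle_setopt(h, httpget = TRUE)
  curl::curl_download(url, destfile, handle = h)
}
```

Either function returns the path of the downloaded file, which can then
be passed to read.csv().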

Looking at the documentation, it seems that the datasets may be
available via FTP: https://help.ceda.ac.uk/article/280-ftp

It should be possible to use the curl package to download the files.
Depending on how R is built, it could also be possible to feed the FTP
URL directly to read.csv, if you put the username and the password
inside it: ftp://username:password@ftp-server.hostname/path/to/file.csv
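
Both FTP routes could look roughly like this. The host name and path
are the same placeholders as above, not the real CEDA server details,
and ftp_url() is just a helper I made up for illustration. Special
characters in the user name or password have to be percent-encoded when
they are embedded in the URL.

```r
# Build an FTP URL with embedded, percent-encoded credentials.
ftp_url <- function(user, password, host, path) {
  sprintf(
    "ftp://%s:%s@%s/%s",
    utils::URLencode(user, reserved = TRUE),
    utils::URLencode(password, reserved = TRUE),
    host,
    sub("^/", "", path)  # avoid a double slash after the host
  )
}

# Route 1: hand the URL straight to read.csv().
# Whether this works depends on how R was built.
read_ftp_csv_base <- function(user, password, host, path, ...) {
  read.csv(ftp_url(user, password, host, path), ...)
}

# Route 2: download with the curl package, then read the local copy.
# The credentials go into the handle instead of the URL.
read_ftp_csv_curl <- function(user, password, host, path, ...) {
  h <- curl::new_handle(userpwd = paste0(user, ":", password))
  tmp <- tempfile(fileext = ".csv")
  curl::curl_download(
    sprintf("ftp://%s/%s", host, sub("^/", "", path)),
    tmp,
    handle = h
  )
  read.csv(tmp, ...)
}

# Only builds the URL, no network access:
u <- ftp_url("username", "p@ssword", "ftp-server.hostname",
             "path/to/file.csv")
```

With route 2 the password does not end up in the URL.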

-- 
Best regards,
Ivan
