[R] Download data from NASA for multiple locations - RCurl

Miluji Sb milujisb at gmail.com
Mon Oct 16 00:35:36 CEST 2017


Dear David,

This is amazing, thank you so much. If I may ask another question:

The output looks like the following:

###
dput(head(x,15))
c("Metadata for Requested Time Series:", "",
"prod_name=GLDAS_NOAH025_3H_v2.0",
"param_short_name=Tair_f_inst", "param_name=Near surface air temperature",
"unit=K", "begin_time=1970-01-01T00", "end_time=1979-12-31T21",
"lat= 42.36", "lon=-71.06", "Request_time=2017-10-15 22:20:03 GMT",
"", "Date&Time               Data", "1970-01-01T00:00:00\t267.769",
"1970-01-01T03:00:00\t264.595")
###
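
Rather than counting header lines by hand, the header can be located by searching for the "Date&Time" line. This is only a minimal sketch, assuming every response has the same layout as the sample above; the names header_end, meta and data_lines are illustrative:

```r
# Sample lines copied from the dput() output above.
x <- c("Metadata for Requested Time Series:", "",
       "prod_name=GLDAS_NOAH025_3H_v2.0",
       "param_short_name=Tair_f_inst", "param_name=Near surface air temperature",
       "unit=K", "begin_time=1970-01-01T00", "end_time=1979-12-31T21",
       "lat= 42.36", "lon=-71.06", "Request_time=2017-10-15 22:20:03 GMT",
       "", "Date&Time               Data", "1970-01-01T00:00:00\t267.769",
       "1970-01-01T03:00:00\t264.595")

# Locate the column-header line instead of hard-coding its position.
header_end <- grep("^Date&Time", x)[1]

# Split the "key=value" lines above it into a named character vector.
meta_lines <- grep("=", x[seq_len(header_end - 1)], value = TRUE)
meta <- trimws(sub("^[^=]*=", "", meta_lines))
names(meta) <- sub("=.*$", "", meta_lines)

# Everything after the column-header line is data.
data_lines <- x[-seq_len(header_end)]
```

With this, meta[["lat"]] and meta[["lon"]] give the coordinates reported by the service as character strings, which could be checked against the cities data.frame.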

Thus I need to drop the first 13 rows and do the following to add
identifying information:

###
x <- x[-(1:13)]  # drop the 13 header lines first

mydata <- data.frame(year = substr(x, 1, 4),
                     month = substr(x, 6, 7),
                     day = substr(x, 9, 10),
                     hour = substr(x, 12, 13),
                     temp = substr(x, 21, 27))

mydata$city <- rep(cities[1,1], nrow(mydata))
mydata$state <- rep(cities[1,2], nrow(mydata))
mydata$lon <- rep(cities[1,3], nrow(mydata))
mydata$lat <- rep(cities[1,4], nrow(mydata))
###
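
The steps above could be wrapped in one function so they can be reused per city. A minimal sketch, assuming the tab-separated layout shown in the sample; parse_rods() is a hypothetical helper name, not part of any package. Unlike the code above, it returns temp as numeric and trims the blanks around the state code:

```r
# parse_rods() is a hypothetical helper, not from any package.
parse_rods <- function(x, city, state, lon, lat) {
  # Drop everything up to and including the "Date&Time" column header.
  x <- x[-seq_len(grep("^Date&Time", x)[1])]
  x <- x[nzchar(x)]                      # ignore any blank lines

  # Each remaining line is "<timestamp>\t<value>".
  parts <- strsplit(x, "\t", fixed = TRUE)
  stamp <- vapply(parts, `[`, character(1), 1)
  value <- vapply(parts, `[`, character(1), 2)

  data.frame(year  = substr(stamp, 1, 4),
             month = substr(stamp, 6, 7),
             day   = substr(stamp, 9, 10),
             hour  = substr(stamp, 12, 13),
             temp  = as.numeric(value),
             city  = as.character(city),
             state = trimws(as.character(state)),
             lon   = lon,
             lat   = lat,
             stringsAsFactors = FALSE)
}

# Usage on a shortened copy of the sample output:
sample_lines <- c("Metadata for Requested Time Series:", "",
                  "Date&Time               Data",
                  "1970-01-01T00:00:00\t267.769",
                  "1970-01-01T03:00:00\t264.595")
mydata <- parse_rods(sample_lines, "Boston", " MA ", -71.06, 42.36)
```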

Is it possible to incorporate these into your code so the data looks like
this:

dput(droplevels(head(mydata)))
structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1970",
class = "factor"),
    month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class =
"factor"),
    day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class =
"factor"),
    hour = structure(1:6, .Label = c("00", "03", "06", "09",
    "12", "15"), class = "factor"), temp = structure(c(6L, 4L,
    2L, 1L, 3L, 5L), .Label = c("261.559", "262.525", "262.648",
    "264.595", "265.812", "267.769"), class = "factor"), city =
structure(c(1L,
    1L, 1L, 1L, 1L, 1L), .Label = "Boston", class = "factor"),
    state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = " MA ", class =
"factor"),
    lon = c(-71.06, -71.06, -71.06, -71.06, -71.06, -71.06),
    lat = c(42.36, 42.36, 42.36, 42.36, 42.36, 42.36)), .Names = c("year",
"month", "day", "hour", "temp", "city", "state", "lon", "lat"
), row.names = c(NA, 6L), class = "data.frame")
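
One possible way to combine the download and the parsing is sketched below. It is only an illustration under stated assumptions: build_url(), fetch_city() and download_cities() are hypothetical helper names, the response layout is assumed to match the sample above, and the reader argument exists only so the code can be tried without contacting the (slow) service. The temp column is numeric here rather than a factor.

```r
# All three helpers are hypothetical names, not from any package.
build_url <- function(lon, lat,
                      variable = "GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst",
                      start = "1970-01-01T00", end = "1979-12-31T00") {
  paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi",
         "?variable=", variable,
         "&startDate=", start, "&endDate=", end,
         "&location=GEOM:POINT(", lon, ",%20", lat, ")",
         "&type=asc2")
}

# Read one city (a one-row data.frame) and return the tidy data.frame.
fetch_city <- function(row, reader = readLines) {
  x <- reader(build_url(row$lon, row$lat))
  x <- x[-seq_len(grep("^Date&Time", x)[1])]   # drop the metadata header
  x <- x[nzchar(x)]                            # drop blank lines
  data.frame(year  = substr(x, 1, 4),
             month = substr(x, 6, 7),
             day   = substr(x, 9, 10),
             hour  = substr(x, 12, 13),
             temp  = as.numeric(sub("^.*\t", "", x)),
             city  = as.character(row$city),
             state = trimws(as.character(row$state)),
             lon   = row$lon,
             lat   = row$lat,
             stringsAsFactors = FALSE)
}

# Loop over all cities, write one csv per city, return everything stacked.
download_cities <- function(cities, out_dir = tempdir(), reader = readLines) {
  pieces <- lapply(seq_len(nrow(cities)), function(i) {
    d <- fetch_city(cities[i, ], reader)
    out <- file.path(out_dir, paste0(d$city[1], "_", d$state[1], ".csv"))
    write.csv(d, out, row.names = FALSE)
    d
  })
  do.call(rbind, pieces)
}

# Example call (commented out because it contacts the live service):
# all_data <- download_cities(cities, out_dir = "~")
```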

Apologies for asking repeated questions and thank you again!

Sincerely,

Milu

On Sun, Oct 15, 2017 at 11:45 PM, David Winsemius <dwinsemius at comcast.net>
wrote:

>
> > On Oct 15, 2017, at 2:02 PM, Miluji Sb <milujisb at gmail.com> wrote:
> >
> > Dear all,
> >
> > I am trying to download time-series climatic data from GES DISC (NASA)
> > Hydrology Data Rods web-service. Unfortunately, no wget method is
> > available.
> >
> > Five parameters are needed for data retrieval: variable, location,
> > startDate, endDate, and type. For example:
> >
> > ###
> > https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2
> > ###
> >
> > In this case, variable: Tair_f_inst (temperature), location: (-71.06,
> > 42.36), startDate: 01 January 1970; endDate: 31 December 1979; type: asc2
> > (output 2-column ASCII).
> >
> > I am trying to download data for 100 US cities, data for which I have in
> > the following data.frame:
> >
> > ###
> > cities <-  dput(droplevels(head(cities, 5)))
> > structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
> > "Cambridge", "Fall River", "Hartford"), class = "factor"), state =
> > structure(c(2L,
> > 1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "), class = "factor"),
> >    lon = c(-71.06, -73.19, -71.11, -71.16, -72.67), lat = c(42.36,
> >    41.18, 42.37, 41.7, 41.77)), .Names = c("city", "state",
> > "lon", "lat"), row.names = c(NA, 5L), class = "data.frame")
> > ###
> >
> > Is it possible to download the data for the multiple locations
> > automatically (e.g. RCurl) and save them as csv? Essentially, reading
> > coordinates from the data.frame and entering it in the URL.
> >
> > I would also like to add identifying information to each of the data files
> > from the cities data.frame. I have been doing the following for a single
> > file:
>
> Didn't seem that difficult:
>
> library(downloader)  # makes things easier for Macs, perhaps not needed
> # if not used will need to use download.file
>
> for (i in 1:5) {
>   target1 <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/",
>                     "timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:",
>                     "Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00",
>                     "&location=GEOM:POINT(",
>                     cities[i, "lon"],
>                     ",%20", cities[i, "lat"],
>                     ")&type=asc2")
>   # change "~/" to whatever destination directory you may prefer
>   target2 <- paste0("~/",
>                     cities[i, "city"],
>                     cities[i, "state"], ".asc")
>   download(url = target1, destfile = target2)
> }
>
> Now I have 5 named files with extensions ".asc" in my user directory
> (since I'm on a Mac). It is a slow website so patience is needed.
>
> --
> David
>
>
> >
> > ###
> > x <- readLines(con = url(paste0(
> >   "https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/",
> >   "timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:",
> >   "Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00",
> >   "&location=GEOM:POINT(-71.06,%2042.36)&type=asc2")))
> > x <- x[-(1:13)]
> >
> > mydata <- data.frame(year = substr(x,1,4),
> >                     month = substr(x, 6,7),
> >                     day = substr(x, 9, 10),
> >                     hour = substr(x, 12, 13),
> >                     temp = substr(x, 21, 27))
> >
> > mydata$city <- rep(cities[1,1], nrow(mydata))
> > mydata$state <- rep(cities[1,2], nrow(mydata))
> > mydata$lon <- rep(cities[1,3], nrow(mydata))
> > mydata$lat <- rep(cities[1,4], nrow(mydata))
> > ###
> >
> > Help and advice would be greatly appreciated. Thank you!
> >
> > Sincerely,
> >
> > Milu
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'
>  -Gehm's Corollary to Clarke's Third Law
>