[R] Parsing Files in R (USGS StreamFlow data)

stephen sefick ssefick at gmail.com
Mon Oct 5 07:12:34 CEST 2009


Thanks for the help - this was my goal; sorry for not being
straightforward enough.

#021973269 is the Waynesboro Gauge on the Savannah River Proper (SRS)
#02102908 is the Flat Creek Gauge (ftbrfcms)
#02133500 is the Drowning Creek (ftbrbmcm)
#02341800 is the Upatoi Creek Near Columbus (ftbn)
#02342500 is the Uchee Creek Near Fort Mitchell (ftbn)
#02203000 is the Canoochee River Near Claxton (ftst)

L <- readLines("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=60&site_no=021973269,02102908,02133500,02341800,02342500,02203000")

#look for the data lines with USGS in front of them (this takes
#advantage of the agency column)
L.USGS <- grep("^USGS", L, value = TRUE)
DF <- read.table(textConnection(L.USGS), fill = TRUE)
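As a toy illustration (made-up lines, not real USGS output) of why grepping on the agency column works: an RDB file mixes "#" comment lines, a header row, and a format row, and only the data rows begin with "USGS":

```r
# Hypothetical miniature of an RDB file: comments start with "#",
# then a header row and a format row; data rows begin with "USGS"
toy <- c("# Data provided by USGS",
         "agency_cd\tsite_no\tdatetime\tvalue",
         "5s\t15s\t20d\t14n",
         "USGS\t021973269\t2009-10-01 00:00\t1200",
         "USGS\t021973269\t2009-10-01 00:15\t1190")
data_lines <- grep("^USGS", toy, value = TRUE)
toy.df <- read.table(textConnection(data_lines), sep = "\t")
nrow(toy.df)  # only the 2 data rows survive the filter
```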

DF[DF==21973269] <- "Waynesboro Gauge on the Savannah River Proper (SRS)"
DF[DF==2102908] <- "Flat Creek Gauge (ftbrfcms)"
DF[DF==2133500] <- "Drowning Creek (ftbrbmcm)"
DF[DF==2341800] <- "Upatoi Creek Near Columbus (ftbn)"
DF[DF==2342500] <- "Uchee Creek Near Fort Mitchell (ftbn)"
DF[DF==2203000] <- "Canoochee River Near Claxton (ftst)"
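The comparisons above use 21973269 rather than "021973269" because read.table() parses the site numbers as numeric and drops the leading zero. A minimal sketch (hypothetical one-line input) of keeping the zero by forcing character columns instead:

```r
# One made-up data row in the same column layout as above
row <- "USGS 021973269 2009-10-01 00:00 4.5 1200 0.0"
# colClasses = "character" is recycled across all columns,
# so the site number is kept as text with its leading zero
DF2 <- read.table(textConnection(row), colClasses = "character")
DF2$V2              # "021973269" -- leading zero preserved
as.numeric(DF2$V2)  # 21973269   -- what the default parsing gives
```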

colnames(DF) <- c("agency", "gauge", "date", "time", "gauge_height",
"discharge", "precipitation")

dts <- as.character(DF[,"date"])
tms <- as.character(DF[,"time"])
library(chron)
date_time <- as.chron(paste(dts, tms), "%Y-%m-%d %H:%M")
DF <- data.frame(date_time, DF)
library(ggplot2)
qplot(as.POSIXct(date_time), discharge, data = DF, geom = "line") +
  facet_wrap(~gauge, scales = "free_y") +
  coord_trans(y = "log10")
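As an aside, the six DF[DF==...] assignments each scan the whole data frame; a named lookup vector (same station names as above, with a toy gauge column for illustration) does the recoding in one step:

```r
# Named lookup vector: site number (as text) -> station name
sites <- c("21973269" = "Waynesboro Gauge on the Savannah River Proper (SRS)",
           "2102908"  = "Flat Creek Gauge (ftbrfcms)",
           "2133500"  = "Drowning Creek (ftbrbmcm)",
           "2341800"  = "Upatoi Creek Near Columbus (ftbn)",
           "2342500"  = "Uchee Creek Near Fort Mitchell (ftbn)",
           "2203000"  = "Canoochee River Near Claxton (ftst)")
gauge <- c(21973269, 2203000)          # toy gauge column
unname(sites[as.character(gauge)])     # recoded station names
```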

On Sun, Oct 4, 2009 at 9:20 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
> On 5/10/2009, at 2:49 PM, stephen sefick wrote:
>
>> http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269
>>
>> I would like to be able to parse this file up:
>>
>> I can do this
>> x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
>>                 skip=26)
>>
>> but if I add another gauge to this
>>
>> x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500",
>>                 skip=26)
>> It does not work because there are two files appended to each other.
>>
>> It would be easy enough to write the code so that each individual
>> gauge would be read in as a different file, but is there a way to get
>> this information in using the commented part of the file to give the
>> headers?  This is probably a job for some other programming language
>> like Perl, but I don't know Perl.
>>
>> any help would be very helpful.
>
> I'm not completely clear on what's going on here --- (a) I'm not sure what
> you mean by ``using the commented part of the file to give the headers'';
> the headers are not ``commented'', and (b) I'm puzzled by the fact that
> there are 9 column headers/field names, but only 7 columns/fields.
>
> Be that as it may, here's what I'd do:
>
> x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
>                 skip=23, nrows=1, header=TRUE, check.names=FALSE)
> y <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
>                 skip=26)
> names(y) <- names(x)[1:7]
>
> This ***appears*** to give a reasonably sensible data frame.
>
> Is this anything like what you want?
>
>        cheers,
>
>                Rolf Turner
>
>



-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

								-K. Mullis



