# [R] Calculate daily means from 5-minute interval data

Jeff Newmiller
Mon Aug 30 04:47:20 CEST 2021

IMO assuming periodicity is bad practice for this. Missing timestamps happen too, and there is no reason to build a broken analysis process.
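
A minimal sketch of grouping by the recorded timestamp instead of by row position, in base R. The data frame `dat` and its columns `datetime` and `discharge` are hypothetical stand-ins for the real gauge file, and the values are simulated; the time zone is assumed to be UTC only to keep the example self-contained.

```r
# Hypothetical example data: 5-minute readings covering three days (UTC).
set.seed(42)
stamps <- seq(as.POSIXct("2021-01-01 00:00:00", tz = "UTC"),
              by = "5 min", length.out = 3 * 288)
dat <- data.frame(datetime = stamps,
                  discharge = rlnorm(length(stamps)))

# Simulate a logger outage: drop 10 readings from the second day.
dat <- dat[-(300:309), ]

# Derive the calendar day from each timestamp, not from the row number.
day <- as.Date(dat$datetime, tz = "UTC")

# Mean, standard deviation, and sample count for each day.
daily <- data.frame(
  day  = sort(unique(day)),
  mean = as.vector(tapply(dat$discharge, day, mean)),
  sd   = as.vector(tapply(dat$discharge, day, sd)),
  n    = as.vector(tapply(dat$discharge, day, length))
)

# Days with fewer than 288 readings have gaps and deserve a closer look.
incomplete <- daily[daily$n < 288, ]
print(daily)
print(incomplete)
```

With the gap in place, a fixed-width reshape of the same values would no longer line up with calendar days (and the vector length would not even be a multiple of 288), whereas the per-day counts in `n` make the missing readings visible.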

On August 29, 2021 7:09:01 PM PDT, Richard O'Keefe <raoknz using gmail.com> wrote:
>Why would you need a package for this?
>> samples.per.day <- 12*24
>
>That's 12 5-minute intervals per hour and 24 hours per day.
>Generate some fake data.
>
>> x <- rnorm(samples.per.day * 365)
>> length(x)
>[1] 105120
>
>Reshape the fake data into a matrix where each row represents one
>24-hour period.
>
>> m <- matrix(x, ncol=samples.per.day, byrow=TRUE)
>
>Now we can summarise the rows any way we want.
>The basic tool here is ?apply.
>?rowMeans is said to be faster than using apply to calculate means,
>so we'll use that.  There is no *rowSds so we have to use apply
>for the standard deviation.  I use ?head because I don't want to
>post tens of thousands of meaningless numbers.
>
>> head(rowMeans(m))
>[1] -0.03510177  0.11817337  0.06725203 -0.03578195 -0.02448077 -0.03033692
>> head(apply(m, 1, sd))
>[1] 1.0017718 0.9922920 1.0100550 0.9956810 1.0077477 0.9833144
>
>Now whether this is a *sensible* way to summarise your flow data is a question
>that a hydrologist would be better placed to answer.  I would have started with
>> plot(density(x))
>which I just did with some real river data (only a month of it, sigh).
>Very long tail.
>Even
>> plot(density(log(r)))
>shows a very long tail.  Time to plot the data against time.  Oh my!
>All of the long tail came from a single event.
>There's a period of low flow, then there's a big rainstorm and the
>flow goes WAY up, then over about two days the flow subsides to a new
>somewhat higher level.
>
>None of this is reflected in means or standard deviations.
>This is *time series* data, and time series data of a fairly special kind.
>
>One thing that might be helpful with your data would simply be
>> image(log(m))
>For my one month sample, the spike showed up very clearly that way.
>Because right now, your first task is to get an idea of what the data
>look like, and means-and-standard-deviations won't really do that.
>
>Oh heck, here's another reason to go with image(log(m)).
>With image(m) I just see the one big spike.
>With image(log(m)), I can see that little spikes often start in the
>afternoon of one day and continue into the morning of the next.
>From daily means, it looks like two unusual, but not very
>unusual, days.  From the image, it's clearly ONE rainfall event
>that just happens to straddle a day boundary.
>
>This is all very basic stuff, which is really the point.  You want to use
>elementary tools to look at the data before you reach for fancy ones.
>
>
>On Mon, 30 Aug 2021 at 03:09, Rich Shepard <rshepard using appl-ecosys.com> wrote:
>>
>> I have a year's hydraulic data (discharge, stage height, velocity, etc.)
>> from a USGS monitoring gauge recording values every 5 minutes. The data
>> files contain 90K-93K lines and plotting all these data would produce a
>> solid block of color.
>>
>> What I want are the daily means and standard deviation from these data.
>>
>> As an occasional R user (depending on project needs) I've no idea what
>> packages could be applied to these data frames. There likely are multiple
>> paths to extracting these daily values so summary statistics can be
>> calculated and plotted. I'd appreciate suggestions on where to start to
>> learn how I can do this.
>>
>> TIA,
>>
>> Rich
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> and provide commented, minimal, self-contained, reproducible code.