[R] Correctly applying aggregate.ts() [RESOLVED]

Sat Sep 8 00:39:36 CEST 2018

On Fri, 7 Sep 2018, Bert Gunter wrote:

> Well, let's see:
> "monthly.rain <- aggregate.ts(x = dp['sampdate','prcp'], by = list(month = \
> substr(dp$sampdate, 1, 7)), FUN = sum, na.rm = TRUE)"
>
> 1. x is a data frame, so why are you using the time series method?
> Perhaps you need to study S3 method usage in R.

Bert,

   I saw the four varieties of aggregate and thought the time series
appropriate for the data frame of sequential dates. As I wrote, I had
difficulties understanding which flavor to use.
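
   For anyone else unsure which flavor applies, here is a minimal sketch of
how the choice is made. aggregate() is an S3 generic, so R picks the method
from the class of the first argument. The toy data frame below is made up
for illustration and only mimics the column names used in this thread.

```r
# Toy stand-in for the real precipitation data (values are invented)
dp <- data.frame(
  sampdate = as.Date(c("2005-01-01", "2005-01-02", "2005-02-01")),
  prcp     = c(0.10, 0.25, 0.40)
)

# aggregate() dispatches on the class of its first argument
class(dp)   # a data frame, so aggregate() uses the data frame method
is.ts(dp)   # a date column does not make the object a time series

# List the methods registered for the aggregate() generic
methods(aggregate)
```

Since the object is a data frame and not a ts object, calling aggregate()
and letting dispatch do the work is the safer habit.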

> 2. You have improperly subscripted the data frame: it should be dp[,
> c('sampdate','prcp')] . Perhaps you need to read about how
> subscripting works in R. However, in this case, no subscripting is needed
> (see 3.)

   Ah so. All the examples I saw used single column data frames.
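
   To make the difference concrete, a short sketch of the two forms of
subscripting on a made-up data frame (the values are invented; only the
column names follow this thread):

```r
# Toy stand-in for the real data (values are invented)
dp <- data.frame(
  sampdate = c("2005-01-01", "2005-01-02", "2005-02-01"),
  prcp     = c(0.10, 0.25, 0.40),
  stringsAsFactors = FALSE
)

# Two arguments mean [row, column]: this asks for the row NAMED "sampdate"
# in column "prcp". No such row exists, so the result is a single NA.
wrong <- dp['sampdate', 'prcp']

# Leaving the row index empty selects all rows of the named columns
both <- dp[, c('sampdate', 'prcp')]

# A single column selected this way is dropped to a plain numeric vector
rain <- dp[, 'prcp']
```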

> 3. As you should be using the data frame method, and the month is
> obtained as a substring of sampdate, you should use dp[,'prcp'] as
> your data frame so that sum() is not applied to the sampdate column.
>
> 4. I assume the "\" indicates <Return> ?

   Yes. Alpine broke the line so I added a newline to the first part.

> Anyway, once you have corrected all that, here's the call:
>
>> monthly.rain <- aggregate(dp[, 'prcp'],
> +                           list(substr(dp$sampdate,1,7)),
> +                           FUN = sum, na.rm = TRUE)

   Thanks for making the syntax so clear.
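
   For the archive, a self-contained sketch of that call on made-up data,
so it can be run without my files. The values are invented, and naming the
grouping element "month" is my own addition to get a readable column name.

```r
# Toy stand-in for the real data (values are invented, one NA included)
dp <- data.frame(
  sampdate = c("2005-01-01", "2005-01-15", "2005-02-03", "2005-02-20"),
  prcp     = c(0.10, NA, 0.40, 0.25),
  stringsAsFactors = FALSE
)

# Sum precipitation within each year-month prefix of the date string.
# na.rm = TRUE is passed through to sum(), so the NA is skipped.
monthly.rain <- aggregate(dp[, 'prcp'],
                          list(month = substr(dp$sampdate, 1, 7)),
                          FUN = sum, na.rm = TRUE)
monthly.rain
```

Because a bare vector is passed in, the summed column in the result is
named "x" rather than "prcp".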

> It's perhaps also worth noting that the formula method (for data
> frames) is somewhat more convenient, especially with several grouping
> factors in the list:
>
>> monthly.rain <- aggregate(prcp ~ substr(sampdate,1,7), data = dp, FUN = sum, na.rm = TRUE)
>> ##yielding
>> monthly.rain
>  substr(sampdate, 1, 7) prcp
> 1                2005-01 4.88
> 2                2005-02 2.27
> 3                2005-03 0.06

   I looked at the formula method without appreciating how to apply it.
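
   A sketch of the formula method with two grouping factors, which is
where it seems to pay off. The "station" column is hypothetical (my data
in this thread has no such column) and all values are invented.

```r
# Toy data; "station" is a hypothetical second grouping column
dp <- data.frame(
  station  = c("A", "A", "B", "B", "B"),
  sampdate = c("2005-01-01", "2005-01-15", "2005-01-03",
               "2005-02-20", "2005-02-21"),
  prcp     = c(0.10, 0.20, 0.40, 0.25, 0.05),
  stringsAsFactors = FALSE
)

# Precompute the month so the output column gets a short name
dp$month <- substr(dp$sampdate, 1, 7)

# Left of ~ is the variable to summarize; right of ~ are the grouping factors
monthly.rain <- aggregate(prcp ~ station + month, data = dp, FUN = sum)
monthly.rain
```

Only station/month combinations that occur in the data appear in the
result, and the summed column keeps the name "prcp".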

   Now I can work with the multiple daily data sets I have and properly
condense them for presentation to readers of the report. And I'm much better
armed to understand how to apply aggregate() to various data sets.

Very much appreciated,

Rich