[R] Sum data according to date in sequence

roslinazairimah zakaria ro@||n@ump @end|ng |rom gm@||@com
Fri Nov 3 21:21:40 CET 2023


Hi Jim,

Yes, that is exactly what I am trying to do. Once I get that, I want to
plot time series data.
Thank you very much Jim.

On Fri, Nov 3, 2023 at 11:58 PM jim holtman <jholtman using gmail.com> wrote:

> Is this what you are after?
>
> library(tidyverse)
>
>
> library(lubridate)
>
> input <- structure(list(StationName = c("PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
>   "PALO ALTO CA / CAMBRIDGE #1"), date = c("1/14/2016", "1/14/2016",
>    "1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/15/2016",
>    "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016",
>    "1/16/2016", "1/16/2016", "1/17/2016", "1/17/2016", "1/17/2016",
>    "1/17/2016", "1/17/2016", "1/18/2016"), time = c("12:09", "19:50",
>   "20:22", "8:25", "14:23", "18:17", "21:46", "10:19", "12:12",
>   "14:12", "16:22", "19:16", "19:19", "20:24", "9:54", "12:16",
>   "13:53", "19:03", "22:00", "8:58"),
>   EnergykWh = c(4.680496, 6.272414,
>   1.032782, 11.004884, 10.096824, 6.658797, 4.808874, 1.469384,
>   2.996239, 0.303222, 4.988339, 8.131804, 0.117156, 3.285669, 1.175608,
>   3.677487, 1.068393, 8.820755, 8.138583, 9.0575)),
>   row.names = c(NA, 20L), class = "data.frame")
> # convert date from character to Date
> byDate <- input |>
>   mutate(newdate = mdy(date)) |>
>   group_by(newdate) |>
>   summarise(total = sum(EnergykWh))
>
> byDate
>
> ## # A tibble: 5 × 2
> ##   newdate    total
> ##   <date>     <dbl>
> ## 1 2016-01-14 12.0
> ## 2 2016-01-15 32.6
> ## 3 2016-01-16 21.3
> ## 4 2016-01-17 22.9
> ## 5 2016-01-18  9.06
>
>
> Thanks
>
> Jim Holtman
> *Data Munger Guru*
>
>
> *What is the problem that you are trying to solve?Tell me what you want to
> do, not how you want to do it.*
>
>
> On Fri, Nov 3, 2023 at 2:51 AM roslinazairimah zakaria <
> roslinaump using gmail.com> wrote:
>
>> Hi,
>> I tried this:
>> # extract date from the time stamp
>> dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh)
>> head(dt1)
>> colnames(dt1) <- c("date", "EnergykWh")
>> and
>> my dt1 becomes these, the dates are replace by numbers.
>>
>> dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh)
>> dput(head(dt1))
>> colnames(dt1) <- c("date", "EnergykWh")
>> dput(head(dt1))
>>
>>
>> > dput(head(dt1))structure(c(16814, 16814, 16814, 16815, 16815, 16815,
>> 4.680496,
>> 6.272414, 1.032782, 11.004884, 10.096824, 6.658797), dim = c(6L,
>> 2L), dimnames = list(NULL, c("date", "EnergykWh")))
>>
>> Then I tried this:
>> library(dplyr)
>> dt1 %>%
>>   group_by(date) %>%
>>   summarise(EnergykWh.sum = sum(EnergykWh))
>> and got this errors
>>
>> dt1 %>%+   group_by(date) %>%+   summarise(EnergykWh.sum =
>> sum(EnergykWh))Error in UseMethod("group_by") :
>>   no applicable method for 'group_by' applied to an object of class
>> "c('matrix', 'array', 'double', 'numeric')"
>>
>>
>>
>> On Fri, Nov 3, 2023 at 7:23 AM roslinazairimah zakaria <
>> roslinaump using gmail.com>
>> wrote:
>>
>> > Dear all,
>> >
>> > I have this set of data. I would like to sum the EnergykWh according
>> date
>> > sequences.
>> >
>> > > head(dt1,20)                   StationName      date  time EnergykWh
>> > 1  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09  4.680496
>> > 2  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50  6.272414
>> > 3  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22  1.032782
>> > 4  PALO ALTO CA / CAMBRIDGE #1 1/15/2016  8:25 11.004884
>> > 5  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824
>> > 6  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17  6.658797
>> > 7  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46  4.808874
>> > 8  PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19  1.469384
>> > 9  PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12  2.996239
>> > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12  0.303222
>> > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22  4.988339
>> > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16  8.131804
>> > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19  0.117156
>> > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24  3.285669
>> > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016  9:54  1.175608
>> > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16  3.677487
>> > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53  1.068393
>> > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03  8.820755
>> > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00  8.138583
>> > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016  8:58  9.057500
>> >
>> > I have tried this:
>> > library(dplyr)
>> > sums <- dt1 %>%
>> >   group_by(date) %>%
>> >   summarise(EnergykWh = sum(EnergykWh))
>> >
>> > head(sums,20)
>> >
>> > The date is not by daily sequence but by year sequence.
>> >
>> > > head(sums,20)# A tibble: 20 × 2
>> >    date      EnergykWh
>> >    <chr>         <dbl> 1 1/1/2017     25.3   2 1/1/2018     61.0   3
>> 1/1/2019      0.627 4 1/1/2020     10.7   5 1/10/2017    69.4   6
>> 1/10/2018    54.5   7 1/10/2019    49.1   8 1/10/2020    45.9   9
>> 1/11/2017    73.9  10 1/11/2018    53.3  11 1/11/2019    93.5  12
>> 1/11/2020    66.7  13 1/12/2017    78.6  14 1/12/2018    42.2  15
>> 1/12/2019    22.7  16 1/12/2020    80.9  17 1/13/2017    85.6  18
>> 1/13/2018    46.4  19 1/13/2019    40.0  20 1/13/2020   121.
>> >
>> >
>> >
>> > Thank you very much for any help given.
>> >
>> >
>> > --
>> > *Roslinazairimah Zakaria*
>> > *Tel: +609-5492370; Fax. No.+609-5492766*
>> >
>> > *Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
>> > roslinaump using gmail.com <roslinaump using gmail.com>*
>> > Faculty of Industrial Sciences & Technology
>> > University Malaysia Pahang
>> > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>> >
>>
>>
>> --
>> *Roslinazairimah Zakaria*
>> *Tel: +609-5492370; Fax. No.+609-5492766*
>>
>> *Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
>> roslinaump using gmail.com <roslinaump using gmail.com>*
>> Faculty of Industrial Sciences & Technology
>> University Malaysia Pahang
>> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

-- 
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

*Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
roslinaump using gmail.com <roslinaump using gmail.com>*
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia

	[[alternative HTML version deleted]]



More information about the R-help mailing list