[R] Creating a year-month indicator and groupby with category

Rui Barradas
Mon Oct 3 07:47:41 CEST 2022


Hello,

First of all, I'll repost the data at the end because the OP's dput
output contains an external pointer reference:

problems = <pointer: 0x7fd732028680>

and this must be removed for the dput output to run.
Suggestion: coerce to class "data.frame" and post the output of


dput(as.data.frame(dat))
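
If extra attributes are still present after the coercion, they can also be dropped explicitly before calling dput. The following is only a minimal sketch: dat_demo is a made-up stand-in for the OP's object, and its "spec" attribute holds a placeholder string rather than a real readr column specification.

```r
# Hypothetical stand-in for an object read with readr: a data frame
# carrying extra classes and an extra attribute (placeholder value).
dat_demo <- structure(
  list(year = c(2018, 2019), month = c(12, 1)),
  row.names = c(NA, -2L),
  spec = "placeholder",
  class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame")
)

# Coerce to a plain data.frame, then remove leftover attributes.
# Assigning NULL to an attribute that does not exist is harmless.
clean <- as.data.frame(dat_demo)
attr(clean, "spec") <- NULL
attr(clean, "problems") <- NULL

# The output now contains only names, row.names and class.
dput(clean)
```
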


Now the plot.
Here are two plots of share by date, grouped by company: one with base R 
graphics and the other with package ggplot2.

Create a date/time column to be used by both plots.


dat$date <- with(dat, ISOdate(year, month, 1))
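
To see what this column holds, here is a small sketch on a few made-up year/month values (the vectors year, month and ym are illustrative names, not part of the OP's data). ISOdate returns a POSIXct date-time, by default at 12:00 GMT, and format can turn it into a year-month label.

```r
# Illustrative values only
year  <- c(2018, 2019, 2019)
month <- c(12, 1, 2)

# First day of each month, as POSIXct (noon, GMT by default)
date <- ISOdate(year, month, 1)
class(date)

# A character year-month indicator, e.g. for labels or grouping
ym <- format(date, "%Y-%m")
ym
# "2018-12" "2019-01" "2019-02"

# If a plain Date is preferred over a date-time
as.Date(date)
```

Keeping a real date class for the x axis, rather than the character label, lets the plotting functions space the months correctly.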


1) Base R plot.


ylim <- range(dat$share) + c(0, 2)  # make room for the legend on top
comp <- unique(dat$company)         # draw each line in a loop on companies

# open a blank plot with all the data,
# setting the ylim as explained above
plot(share ~ date, dat, type = "n", ylim = ylim)
for(i in seq_along(comp)) {
   lines(share ~ date, subset(dat, company == comp[i]), col = i)
}
legend("top", legend = comp, col = seq_along(comp), lty = "solid",
       horiz = TRUE)
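
A possible variation, sketched on made-up numbers (the data frame demo and its values are hypothetical): split the data by company with split() instead of calling subset() inside the loop, and draw the x axis with year-month labels using axis.POSIXct.

```r
# Made-up data, not the OP's
demo <- data.frame(
  year    = c(2018, 2018, 2018, 2018, 2018, 2018),
  month   = c(1, 2, 3, 1, 2, 3),
  company = c("ABC", "ABC", "ABC", "FGH", "FGH", "FGH"),
  share   = c(20, 18, 19, 50, 52, 51)
)
demo$date <- ISOdate(demo$year, demo$month, 1)

# One data frame per company, named by company
by_comp <- split(demo, demo$company)

# Blank plot without the default x axis, with room on top for the legend
ylim <- range(demo$share) + c(0, 5)
plot(share ~ date, demo, type = "n", ylim = ylim, xaxt = "n")

# One tick per month, labelled as year-month
ticks <- sort(unique(demo$date))
axis.POSIXct(1, at = ticks, format = "%Y-%m")

# One line per company
for (i in seq_along(by_comp)) {
  lines(share ~ date, by_comp[[i]], col = i)
}
legend("top", legend = names(by_comp), col = seq_along(by_comp),
       lty = "solid", horiz = TRUE)
```
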


2) ggplot2 plot.


library(ggplot2)

ggplot(dat, aes(date, share, color = company)) +
   geom_line() +
   scale_x_datetime(date_labels = "%Y-%m") +
   scale_color_manual(values = c(ABC = "black", FGH = "red")) +
   theme_bw()


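3) The data, slightly edited and reposted with the native pipe operator 
introduced in R 4.1.0 (to make it look modern). The commented line is the 
command that produced the output; run only the assignment that follows.


# dat |> as.data.frame() |> dput()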
dat <-
structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
"ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), date = 
structure(c(1543665600,
1546344000, 1549022400, 1551441600, 1554120000, 1556712000, 1559390400,
1561982400, 1483272000, 1485950400, 1488369600, 1491048000, 1493640000,
1496318400, 1498910400, 1501588800, 1504267200, 1506859200, 1509537600,
1512129600, 1514808000, 1517486400, 1519905600, 1522584000, 1525176000,
1527854400, 1530446400, 1533124800, 1535803200, 1538395200, 1541073600,
1543665600, 1546344000, 1549022400, 1551441600, 1554120000, 1556712000
), class = c("POSIXct", "POSIXt"), tzone = "GMT")),
row.names = c(NA, -37L), class = "data.frame")


Hope this helps,

Rui Barradas

At 00:04 on 03/10/2022, Tariq Khasiri wrote:
> Hello,
> 
> I have the following data. I want to show in a line plot how each different
> company is earning over the timeline of my data sample.
> 
> I'm not sure how I can create the *year-month indicator* to plot it nicely
> in my horizontal axis out of my dataset.
> 
> After creating the *year-month* indicator ( which will be in my x axis) I
> want to create a dataframe where I can groupby companies over the
> year-month indicator by putting *share* in the y axis as variables.
> 
> ### data is like the following
> 
> dput(dat)
> structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
> 2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
> 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
> 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
> 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
> "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
> ), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
> 53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
> 54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
> -37L), spec = structure(list(cols = list(year = structure(list(), class =
> c("collector_double",
> "collector")), month = structure(list(), class = c("collector_double",
> "collector")), company = structure(list(), class = c("collector_character",
> "collector")), share = structure(list(), class = c("collector_double",
> "collector")), com_name = structure(list(), class = c("collector_double",
> "collector"))), default = structure(list(), class = c("collector_guess",
> "collector")), delim = ","), class = "col_spec"), problems = <pointer:
> 0x7fd732028680>, class = c("spec_tbl_df",
> "tbl_df", "tbl", "data.frame"))