[R] Creating a year-month indicator and groupby with category

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Mon Oct 3 09:45:40 CEST 2022


Hi Tariq,
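There were a couple of glitches in your data structure. Notably, the
problems = <pointer: 0x7fd732028680> attribute in your dput() output is not
valid R syntax, so I have dropped it below. Here's an example of a simple plot: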

dat<-structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
"ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
-37L), spec = structure(list(cols = list(year = structure(list(), class =
c("collector_double",
"collector")), month = structure(list(), class = c("collector_double",
"collector")), company = structure(list(), class = c("collector_character",
"collector")), share = structure(list(), class = c("collector_double",
"collector")), com_name = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
# convert year and month fields to dates about the middle of each month
dat$date<-as.Date(paste(dat$year,dat$month,15,sep="-"),"%Y-%m-%d")
# plot the values for one company
plot(dat$date[dat$company=="ABC"],dat$share[dat$company=="ABC"],
 main="Plot of dat",xlab="Year",ylab="Share",
 xlim=range(dat$date),ylim=range(dat$share),
 type="l",col="red")
# add a line for the other one
lines(dat$date[dat$company=="FGH"],dat$share[dat$company=="FGH"],col="green")
# get the x-axis limits in user coordinates (dates are stored as day counts)
xspan<-par("usr")[1:2]
# place a legend about 30% of the way across the x axis, at share = 35
legend(xspan[1]+diff(xspan)*0.3,35,c("ABC","FGH"),lty=1,col=c("red","green"))
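
If you also want the "group by company over year-month" table you asked
about, one possible sketch in base R is below. It uses only a six-row subset
of your data so that it runs on its own, and the names ym and wide are my
own, purely for illustration.

```r
# Six rows copied from the data above (2018-12 to 2019-02, both companies)
dat <- data.frame(
  year    = c(2018, 2019, 2019, 2018, 2019, 2019),
  month   = c(12, 1, 2, 12, 1, 2),
  company = c("ABC", "ABC", "ABC", "FGH", "FGH", "FGH"),
  share   = c(20, 16.5, 15, 48, 41, 40)
)

# year-month indicator as a zero-padded "YYYY-MM" string, which sorts correctly
dat$ym <- sprintf("%d-%02d", as.integer(dat$year), as.integer(dat$month))

# one row per year-month, one column per company
# (mean() is only a placeholder here, as each cell holds a single value)
wide <- tapply(dat$share, list(dat$ym, dat$company), mean)
print(wide)
```

Each column of wide is then the share series for one company, indexed by
year-month.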

There are many more elegant ways to plot something like this.
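
As one example, here is a rough sketch that splits the data by company and
draws one line per group in a loop, so it does not need editing when more
companies are added. It runs on a six-row subset of your data so that it
stands alone; the names by_company and cols and the colour choices are
arbitrary assumptions of mine.

```r
# Six rows copied from the data above (2018-12 to 2019-02, both companies)
dat <- data.frame(
  year    = c(2018, 2019, 2019, 2018, 2019, 2019),
  month   = c(12, 1, 2, 12, 1, 2),
  company = c("ABC", "ABC", "ABC", "FGH", "FGH", "FGH"),
  share   = c(20, 16.5, 15, 48, 41, 40)
)

# mid-month Date for the x axis, plus a "YYYY-MM" label for tables
dat$date <- as.Date(paste(dat$year, dat$month, 15, sep = "-"), "%Y-%m-%d")
dat$ym   <- format(dat$date, "%Y-%m")

# sort so each line is drawn left to right, then split by company
dat <- dat[order(dat$company, dat$date), ]
by_company <- split(dat, dat$company)
cols <- setNames(c("red", "green"), names(by_company))

# empty plot spanning all the data, then one line per company
plot(range(dat$date), range(dat$share), type = "n",
     main = "Share by company", xlab = "Year", ylab = "Share")
for (co in names(by_company)) {
  lines(by_company[[co]]$date, by_company[[co]]$share, col = cols[co])
}
legend("topright", legend = names(by_company), lty = 1, col = cols)
```

With more than two companies you would only need to extend the cols vector.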

Jim

On Mon, Oct 3, 2022 at 10:05 AM Tariq Khasiri <tariqkhasiri using gmail.com> wrote:
>
> Hello,
>
> I have the following data. I want to show in a line plot how each different
> company is earning over the timeline of my data sample.
>
> I'm not sure how I can create the *year-month indicator* to plot it nicely
> in my horizontal axis out of my dataset.
>
> After creating the *year-month* indicator ( which will be in my x axis) I
> want to create a dataframe where I can groupby companies over the
> year-month indicator by putting *share* in the y axis as variables.
>
> ### data is like the following
>
> dput(dat)
> structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
> 2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
> 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
> 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
> 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
> "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
> ), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
> 53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
> 54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
> -37L), spec = structure(list(cols = list(year = structure(list(), class =
> c("collector_double",
> "collector")), month = structure(list(), class = c("collector_double",
> "collector")), company = structure(list(), class = c("collector_character",
> "collector")), share = structure(list(), class = c("collector_double",
> "collector")), com_name = structure(list(), class = c("collector_double",
> "collector"))), default = structure(list(), class = c("collector_guess",
> "collector")), delim = ","), class = "col_spec"), problems = <pointer:
> 0x7fd732028680>, class = c("spec_tbl_df",
> "tbl_df", "tbl", "data.frame"))