[R] Creating a year-month indicator and groupby with category

Tue Oct 4 08:23:29 CEST 2022

Hello,

The error comes from having 2 colors and 4 labels in

scale_colour_manual("", values = c(f_cols[1], m_cols[1]), labels = 
c("F", "M" , "C" , "G"))

You need the same number of colors and labels. For instance, the 
following vector has 4 colors, and with it the plot code doesn't give an 
error, though they're probably not the colors you want.

values = c(f_cols[1], m_cols[1], f_cols[2], m_cols[2])
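
As a rough sketch of the same idea with a named vector (ggplot2 checks the 
number of values against the number of distinct companies in the data, 
which is why the message reports how many are needed versus provided). 
The toy data frame, the company codes F, M, C, G used as data values, and 
the name company_cols are assumptions for illustration only, not your 
real data:

```r
library(ggplot2)

# Hypothetical stand-in data: four companies, three months each
toy <- data.frame(
  month   = rep(1:3, times = 4),
  share   = c(20, 18, 17, 55, 53, 50, 10, 12, 14, 15, 17, 19),
  company = rep(c("F", "M", "C", "G"), each = 3)
)

f_cols <- c("#008080", "#329999")
m_cols <- c("#4b0082", "#6e329b")

# One colour per company; the names tie each colour to a value of `company`
company_cols <- c(F = f_cols[1], M = m_cols[1], C = f_cols[2], G = m_cols[2])

p <- ggplot(toy, aes(x = month, y = share, colour = company)) +
  geom_line() +
  geom_point(size = 1) +
  scale_colour_manual("", values = company_cols)

# Building the plot is the step where the manual scale maps the colours
built <- ggplot_build(p)
```

With named values the order of the vector doesn't matter, since each 
colour is matched to its company by name.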

Besides, why are you defining 2 color vectors with 6 colors each if you 
only have 4 companies?
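
One possible way to keep the palette the same size as the data, sketched 
in base R. The company vector below is a made-up stand-in for 
dat$company, and lev and company_cols are hypothetical names:

```r
# Hypothetical company codes; in practice this would be dat$company
company <- c("F", "F", "M", "C", "G", "G")

f_cols <- c("#008080", "#329999", "#66b2b2",
            "#7fbfbf", "#99cccc", "#cce5e5")

# Keep only as many colours as there are distinct companies
lev <- sort(unique(company))
company_cols <- setNames(f_cols[seq_along(lev)], lev)

# Fail early, before plotting, if there are more companies than colours
# (indexing past the end of f_cols would yield NA)
stopifnot(!anyNA(company_cols))
company_cols
```

The resulting named vector could then be passed as values = company_cols 
to scale_colour_manual().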

Hope this helps,

Rui Barradas

Às 23:30 de 03/10/2022, Tariq Khasiri escreveu:
> Thanks everyone for being so kind and patient with me throughout the
> process! Mr. Barradas and Mr. Lemon, it was very generous of you to take
> the time to go over my code and data and to give me meaningful
> feedback!
> 
> With your help and suggestions, I was able to make a graph from my
> data. My main data has four companies, so I am making the graph
> a little more advanced. However, after running the command, I get
> an error saying that 4 values are needed but only 2 were provided. Would
> anyone kindly point out the mistake and how I can fix it?
> 
> The data is the same, but my main data has 4 companies, whereas in the
> example I posted to the list I gave only 2 for convenience.
> 
> #### Code needed to execute it #########
> 
> library(tidyverse)
> library(showtext)
> library(usefunc)
> library(patchwork)
> library(cowplot)
> library(rcartocolor)
> library(zoo)
> 
> # load fonts
> font_add_google(name = "Bungee Shade", family = "bungee")
> font_add_google(name = "Dosis", family = "dosis")
> showtext_auto()
> 
> # set colours
> f_cols = c("#008080", "#329999", "#66b2b2",
>             "#7fbfbf", "#99cccc", "#cce5e5")
> m_cols = c("#4b0082", "#6e329b", "#9366b4",
>             "#a57fc0", "#b799cd", "#dbcce6")
> 
> dat$YearMonth <- as.yearmon(paste(dat$year, " ", dat$month), "%Y %m")
> 
> # plot of share of companies per year
> p1 <- ggplot(data = dat,
>               mapping = aes(x = YearMonth, y = share, colour = company)) +
>    geom_line() +
>    geom_point(size = 1) +
>    scale_colour_manual("", values = c(f_cols[1], m_cols[1]), labels = c("F",
> "M" , "C" , "G")) +
>    scale_y_continuous(limits = c(0, 80)) +
>    coord_cartesian(expand = F) +
>    labs(x = "Year",
>         y = "Share of Companies") +
>    theme(legend.position = c(0.1, 0.9),
>          legend.title = element_blank(),
>          legend.text = element_text(family = "dosis", size = 14),
>          panel.background = element_rect(fill = "#FAFAFA",
>                                          colour = "#FAFAFA"),
>          plot.background = element_rect(fill = "#FAFAFA",
>                                         colour = "#FAFAFA"),
>          legend.background = element_rect(fill = "transparent",
>                                           colour = "transparent"),
>          legend.key = element_rect(fill = "transparent",
>                                    colour = "transparent"),
>          axis.title.y = element_text(margin = margin(0, 20, 0, 0),
>                                      family = "dosis"),
>          axis.text = element_text(family = "dosis"),
>          plot.margin = unit(c(0.5, 0.8, 0.5, 0.5), "cm"),
>          panel.grid.major = element_line(colour = "#DEDEDE"),
>          panel.grid.minor = element_blank())
> p1
> 
> The error is saying :
> 
> └─ggplot2 (local) FUN(X[[i]], ...)
>    7.         ├─base::unlist(...)
>    8.         └─base::lapply(scales$scales, function(scale) scale$map_df(df = df))
>    9.           └─ggplot2 (local) FUN(X[[i]], ...)
>   10.             └─scale$map_df(df = df)
>   11.               └─ggplot2 (local) f(..., self = self)
>   12.                 └─base::lapply(aesthetics, function(j) self$map(df[[j]]))
>   13.                   └─ggplot2 (local) FUN(X[[i]], ...)
>   14.                     └─self$map(df[[j]])
>   15.                       └─ggplot2 (local) f(..., self = self)
>   16.                         └─self$palette(n)
>   17.                           └─ggplot2 (local) f(...)
>   18.                             └─rlang::abort(glue("Insufficient values in manual scale. {n} needed but only {length(values)} provided."))
> 
> On Mon, 3 Oct 2022 at 02:45, Jim Lemon <drjimlemon using gmail.com> wrote:
> 
>> Hi Tariq,
>> There were a couple of glitches in your data structure. Here's an
>> example of a simple plot:
>>
>> dat<-structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
>> 2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
>> 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
>> 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
>> 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
>> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
>> "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
>> ), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
>> 53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
>> 54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
>> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
>> -37L), spec = structure(list(cols = list(year = structure(list(), class =
>> c("collector_double",
>> "collector")), month = structure(list(), class = c("collector_double",
>> "collector")), company = structure(list(), class = c("collector_character",
>> "collector")), share = structure(list(), class = c("collector_double",
>> "collector")), com_name = structure(list(), class = c("collector_double",
>> "collector"))), default = structure(list(), class = c("collector_guess",
>> "collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
>> "tbl_df", "tbl", "data.frame"))
>> # convert year and month fields to dates about the middle of each month
>> dat$date<-as.Date(paste(dat$year,dat$month,15,sep="-"),"%Y-%m-%d")
>> # plot the values for one company
>> plot(dat$date[dat$company=="ABC"],dat$share[dat$company=="ABC"],
>>   main="Plot of dat",xlab="Year",ylab="Share",
>>   xlim=range(dat$date),ylim=range(dat$share),
>>   type="l",col="red")
>> # add a line for the other one
>>
>> lines(dat$date[dat$company=="FGH"],dat$share[dat$company=="FGH"],col="green")
>> # get the x plot limits as they are date values
>> xspan<-par("usr")[1:2]
>> # place a legend about in the middle of the plot
>>
>> legend(xspan[1]+diff(xspan)*0.3,35,c("ABC","FGH"),lty=1,col=c("red","green"))
>>
>> There are many more elegant ways to plot something like this.
>>
>> Jim
>>
>> On Mon, Oct 3, 2022 at 10:05 AM Tariq Khasiri <tariqkhasiri using gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I have the following data. I want to show in a line plot how each
>>> different company is earning over the timeline of my data sample.
>>>
>>> I'm not sure how I can create the *year-month indicator* from my dataset
>>> so that it plots nicely on the horizontal axis.
>>>
>>> After creating the *year-month* indicator (which will be on my x axis), I
>>> want to create a data frame in which I can group by company over the
>>> year-month indicator, with *share* on the y axis.
>>>
>>> ### data is like the following
>>>
>>> dput(dat)
>>> structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
>>> 2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
>>> 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
>>> 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
>>> 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
>>> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
>>> "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
>>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
>>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
>>> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
>>> ), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
>>> 53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
>>> 54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
>>> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
>>> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
>>> -37L), spec = structure(list(cols = list(year = structure(list(), class =
>>> c("collector_double",
>>> "collector")), month = structure(list(), class = c("collector_double",
>>> "collector")), company = structure(list(), class =
>>> c("collector_character",
>>> "collector")), share = structure(list(), class = c("collector_double",
>>> "collector")), com_name = structure(list(), class = c("collector_double",
>>> "collector"))), default = structure(list(), class = c("collector_guess",
>>> "collector")), delim = ","), class = "col_spec"), problems = <pointer:
>>> 0x7fd732028680>, class = c("spec_tbl_df",
>>> "tbl_df", "tbl", "data.frame"))
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>