[R] Including percentage values inside columns of a histogram

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Tue Aug 17 00:57:06 CEST 2021


Hi Paul,
I just worked out your first request:

datasetregs<-structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class =
"factor"),
    Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
    15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
    15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
    15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
    15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
    16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
    15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
    15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
    15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class =
"data.frame")
# graphics::hist() has no groups= or scale= arguments, so they are dropped
histval<-with(datasetregs, hist(Amount, breaks="Sturges", col="darkgray"))
library(plotrix)
histpcts<-paste0(round(100*histval$counts/sum(histval$counts),1),"%")
barlabels(histval$mids,histval$counts,histpcts)

I think that's what you asked for:
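
For the other half of the request (one histogram per fiscal year), the same percentage idea can be applied panel by panel in base R. The following is only a sketch: toy is made-up illustrative data standing in for datasetregs, and pct_hist() is a hypothetical helper, not a function from any package mentioned in this thread.

```r
# Toy data standing in for datasetregs (values are illustrative only).
toy <- data.frame(
  Date   = factor(rep(c("AF 2017", "AF 2020", "AF 2021"), times = c(4, 6, 5))),
  Amount = c(40100, 101100, 35000, 15000,
             15000, 15000, 20300, 35100, 15000, 67100,
             15000, 15000, 16600, 25100, 15000)
)

# Hypothetical helper: draw one histogram and label each non-empty bar
# with its percentage of the observations in that panel.
pct_hist <- function(x, main = "") {
  h <- hist(x, breaks = "Sturges", col = "darkgray", main = main,
            xlab = "Amount")
  pct <- 100 * h$counts / sum(h$counts)
  keep <- h$counts > 0                      # skip empty bins
  text(h$mids[keep], h$counts[keep] / 2,
       labels = paste0(round(pct[keep], 1), "%"), cex = 0.8)
  invisible(pct)                            # percentages per bin
}

old_par <- par(mfrow = c(1, 3))             # three panels side by side
sp <- split(toy$Amount, toy$Date)           # one Amount vector per fiscal year
pcts <- lapply(names(sp), function(nm) pct_hist(sp[[nm]], main = nm))
par(old_par)
```

With the real data, replacing toy with datasetregs should be enough; the percentages are then relative to each fiscal year rather than to all 74 rows.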

Jim

On Tue, Aug 17, 2021 at 8:44 AM Paul Bernal <paulbernal07 using gmail.com> wrote:
>
> This is way better, now, how could I put the frequency labels in the
> columns as a percentage, instead of presenting them as counts?
>
> Thank you so much.
>
> Paul
>
> On Mon, 16 Aug 2021 at 17:33, Rui Barradas (<ruipbarradas using sapo.pt>)
> wrote:
>
> > Hello,
> >
> > You forgot to cc the list.
> >
> > Here are two ways; both apply hist() and text() to Amount split
> > by Date. The return value of hist() is saved because it is a list whose
> > members include the bar midpoints and the counts. Those are used to
> > decide where to put the text labels.
> > A vector lbls is created so that counts of zero are not printed.
> >
> > The main difference between the two ways is the histogram's titles.
> >
> >
> > old_par <- par(mfrow = c(1, 3))
> > h_list <- with(datasetregs, tapply(Amount, Date, function(x){
> >    h <- hist(x)
> >    lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
> >    text(h$mids, h$counts/2, labels = lbls)
> >    invisible(h)  # return the histogram so h_list holds one per Date
> > }))
> > par(old_par)
> >
> >
> >
> > old_par <- par(mfrow = c(1, 3))
> > sp <- split(datasetregs, datasetregs$Date)
> > h_list <- lapply(seq_along(sp), function(i){
> >    hist_title <- paste("Histogram of", names(sp)[i])
> >    h <- hist(sp[[i]]$Amount, main = hist_title)
> >    lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
> >    text(h$mids, h$counts/2, labels = lbls)
> >    invisible(h)  # return the histogram so h_list holds one per Date
> > })
> > par(old_par)
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 23:16 on 16/08/21, Paul Bernal wrote:
> > > Dear Rui,
> > >
> > > The hist() function comes from the graphics package, from what I could
> > > see. The thing is that I want to divide the Amount column into several
> > > bins and then generate three different histograms, one for each AF
> > > period (AF refers to fiscal years). As you can see, the data contains
> > > three fiscal years (2017, 2020 and 2021). I want to see the percentage
> > > of cases that fall into different amount categories, from 15,000 and
> > > below, 16,000 to 17,000, from 18,000 to 19,000, and so on.
> > >
> > > Thanks for your kind help.
> > >
> > > Paul
> > >
> > > On Mon, 16 Aug 2021 at 17:07, Rui Barradas (<ruipbarradas using sapo.pt>) wrote:
> > >
> > >     Hello,
> > >
> > >     The function Hist comes from what package?
> > >
> > >     Are you sure you don't want a bar plot?
> > >
> > >
> > >     agg <- aggregate(Amount ~ Date, datasetregs, sum)
> > >     bp <- barplot(Amount ~ Date, agg)
> > >     with(agg, text(bp, Amount/2, labels = Amount))
> > >
> > >
> > >     Hope this helps,
> > >
> > >     Rui Barradas
> > >
> > >     At 22:54 on 16/08/21, Paul Bernal wrote:
> > >      > Hello everyone,
> > >      >
> > >      > I am currently working with R version 4.1.0 and I am trying to include
> > >      > (inside the columns of the histogram) the percentage distribution, and I
> > >      > want to generate three histograms, one for each fiscal year (in the Date
> > >      > column, there are three fiscal years: AF 2017, AF 2020 and AF 2021). However,
> > >      > I can't seem to accomplish this.
> > >      >
> > >      > Here is my data:
> > >      >
> > >      > structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
> > >      > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > >      > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > >      > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > >      > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > >      > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class =
> > >      > "factor"),
> > >      >      Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
> > >      >      15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
> > >      >      15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
> > >      >      15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
> > >      >      15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
> > >      >      16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
> > >      >      15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
> > >      >      15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
> > >      >      15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class =
> > >      > "data.frame")
> > >      >
> > >      > I would like to modify the following script:
> > >      >
> > >      >> with(datasetregs, Hist(Amount, groups=Date, scale="frequency",
> > >      > +   breaks="Sturges", col="darkgray"))
> > >      >
> > >      > #The only thing missing here are the percentages corresponding to each bin
> > >      > (I would like to see the percentages inside each column, or on top outside
> > >      > if possible)
> > >      >
> > >      > Any help will be greatly appreciated.
> > >      >
> > >      > Best regards,
> > >      >
> > >      > Paul.
> > >      >
> > >      >       [[alternative HTML version deleted]]
> > >      >
> > >      > ______________________________________________
> > >      > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >      > https://stat.ethz.ch/mailman/listinfo/r-help
> > >      > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >      > and provide commented, minimal, self-contained, reproducible code.
> > >      >
> > >
> >
>
