[R] Including percentage values inside columns of a histogram

Paul Bernal paulbernal07 using gmail.com
Tue Aug 17 03:04:20 CEST 2021


Thank you very much Mr. Gunter, I will give it a try.

Cheers,

Paul

On Mon, Aug 16, 2021 at 7:49 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:

> I may well misunderstand, but the proffered solutions seem more complicated
> than necessary.
> Note that the return value of hist() can be saved as a list of class
> "histogram" and then plotted with plot.histogram(), which already has a
> "labels" argument that seems to be what you want. A simple example is:
>
> dat <- runif(50, 0, 10)
> myhist <- hist(dat, freq = TRUE, breaks = "Sturges")
>
> ## percentage of observations per bin; density * 100 equals this only when
> ## the bin width is 1
> pct <- round(100 * myhist$counts / sum(myhist$counts), 1)
> plot(myhist, col = "darkgray",
>      labels = as.character(pct),
>      ylim = c(0, 1.1 * max(myhist$counts)))
> ## note that this is plot.histogram because myhist has class "histogram"
>
> Note that I expanded the y axis a bit to be sure to include the labels.
> You can, of course, plot your separate years as Rui has indicated or via
> e.g. ?layout.
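
A minimal sketch of that suggestion applied to the three fiscal years, using layout() for the panels and the labels argument of plot.histogram() for the percentages. The Amount values are copied from the data posted in this thread; the per-year row counts (6, 35, 33) are my reading of the dput() output and should be checked against it, and pct_labels() is a hypothetical helper, not part of any package.

```r
# Amount values copied from the data posted in this thread.
amount <- c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
            15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
            15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
            16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
            15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000)
# Assumption: 6, 35 and 33 rows per fiscal year, as read from the dput() output.
fy <- factor(rep(c("AF 2017", "AF 2020", "AF 2021"), times = c(6, 35, 33)))

# Hypothetical helper: percentage labels for one "histogram" object.
pct_labels <- function(h) {
  pct <- round(100 * h$counts / sum(h$counts), 1)
  ifelse(h$counts == 0, "", paste0(pct, "%"))  # leave empty bins unlabeled
}

# Compute one histogram per fiscal year without drawing anything yet.
hists <- lapply(split(amount, fy), hist, breaks = "Sturges", plot = FALSE)

old_par <- par(no.readonly = TRUE)
layout(matrix(1:3, nrow = 1))                  # three panels side by side
for (nm in names(hists)) {
  h <- hists[[nm]]
  plot(h, col = "darkgray", main = nm, xlab = "Amount",
       labels = pct_labels(h),
       ylim = c(0, 1.1 * max(h$counts)))       # headroom for the labels
}
par(old_par)                                   # restore the previous layout
```

Each panel is scaled by its own counts, so the percentages are within-year shares rather than shares of all 74 rows.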
>
> Apologies if I have misunderstood. Just ignore this in that case.
> Otherwise, I leave it to you to fill in details.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Aug 16, 2021 at 4:14 PM Paul Bernal <paulbernal07 using gmail.com>
> wrote:
>
>> Dear Jim,
>>
>> Thank you so much for your kind reply. Yes, this is what I am looking for;
>> however, I can't clearly see how the bars correspond to the bins on the
>> x-axis. Maybe there is a way to align the amounts so that they match the
>> columns. Sorry if I sound picky, but I just want to learn if there is a
>> way to accomplish this.
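
One possible way to make that correspondence visible, sketched under the assumption that the default Sturges bins are acceptable: suppress the default axes and draw the x-axis ticks exactly at the bin edges stored in the breaks component, so that every bar sits between two labeled ticks. The Amount values are copied from the data posted in this thread.

```r
amount <- c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
            15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
            15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
            16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
            15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000)

# Compute the histogram first so that breaks and counts can be reused.
h <- hist(amount, breaks = "Sturges", plot = FALSE)

# Percentage of all observations in each bin; empty bins get no label.
pct <- paste0(round(100 * h$counts / sum(h$counts), 1), "%")
pct[h$counts == 0] <- ""

# Draw the bars without axes, then add the axes by hand.
plot(h, col = "darkgray", axes = FALSE, main = "Histogram of Amount",
     xlab = "Amount", labels = pct, ylim = c(0, 1.1 * max(h$counts)))
axis(2)
axis(1, at = h$breaks,                         # one tick per bin edge
     labels = format(h$breaks, big.mark = ",", trim = TRUE, scientific = FALSE),
     cex.axis = 0.7)
```

If some tick labels are dropped because they would overlap, a smaller cex.axis or las = 2 (perpendicular labels) may help.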
>>
>> Best regards,
>>
>> Paul
>>
>> On Mon, Aug 16, 2021 at 17:57, Jim Lemon (<drjimlemon using gmail.com>) wrote:
>>
>> > Hi Paul,
>> > I just worked out your first request:
>> >
>> > datasetregs <- structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
>> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class =
>> > "factor"),
>> >     Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
>> >     15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
>> >     15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
>> >     15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
>> >     15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
>> >     16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
>> >     15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
>> >     15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
>> >     15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class =
>> > "data.frame")
>> > ## groups= and scale= are not arguments of graphics::hist(); dropped here
>> > histval <- with(datasetregs, hist(Amount, breaks = "Sturges",
>> >                                   col = "darkgray"))
>> > library(plotrix)
>> > histpcts <- paste0(round(100 * histval$counts / sum(histval$counts), 1), "%")
>> > barlabels(histval$mids, histval$counts, histpcts)
>> >
>> > I think that's what you asked for.
>> >
>> > Jim
>> >
>> > On Tue, Aug 17, 2021 at 8:44 AM Paul Bernal <paulbernal07 using gmail.com>
>> > wrote:
>> > >
>> > > This is way better. Now, how could I put the frequency labels in the
>> > > columns as percentages instead of presenting them as counts?
>> > >
>> > > Thank you so much.
>> > >
>> > > Paul
>> > >
>> > > On Mon, Aug 16, 2021 at 17:33, Rui Barradas (<ruipbarradas using sapo.pt>) wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > You forgot to cc the list.
>> > > >
>> > > > Here are two ways; both apply hist() and text() to Amount split
>> > > > by Date. The return value of hist() is saved because it is a list
>> > > > whose members include the midpoints of the histogram's bars and the
>> > > > counts. Those are used to decide where to put the text labels.
>> > > > A vector lbls is created to suppress the labels for counts of zero.
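
The pieces described above can be inspected without drawing anything. A small sketch using only the six AF 2017 amounts from the posted data (the variable names are mine):

```r
# First six Amount values of the posted data (the AF 2017 rows).
amount_2017 <- c(40100, 101100, 35000, 40100, 15000, 45100)

# plot = FALSE returns the "histogram" object without plotting it.
h <- hist(amount_2017, plot = FALSE)

class(h)     # "histogram"
h$breaks     # bin edges
h$counts     # number of observations per bin
h$mids       # bin midpoints: the x positions used for the text labels

# Each midpoint lies halfway between two consecutive breaks.
all.equal(h$mids, (head(h$breaks, -1) + tail(h$breaks, -1)) / 2)

# Labels with zero counts replaced by NA, as in the lbls vector above.
lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
lbls
```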
>> > > >
>> > > > The main difference between the two ways is the histogram's titles.
>> > > >
>> > > >
>> > > > old_par <- par(mfrow = c(1, 3))
>> > > > h_list <- with(datasetregs, tapply(Amount, Date, function(x){
>> > > >    h <- hist(x)
>> > > >    lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
>> > > >    text(h$mids, h$counts/2, labels = lbls)
>> > > > }))
>> > > > par(old_par)
>> > > >
>> > > >
>> > > >
>> > > > old_par <- par(mfrow = c(1, 3))
>> > > > sp <- split(datasetregs, datasetregs$Date)
>> > > > h_list <- lapply(seq_along(sp), function(i){
>> > > >    hist_title <- paste("Histogram of", names(sp)[i])
>> > > >    h <- hist(sp[[i]]$Amount, main = hist_title)
>> > > >    lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
>> > > >    text(h$mids, h$counts/2, labels = lbls)
>> > > > })
>> > > > par(old_par)
>> > > >
>> > > >
>> > > > Hope this helps,
>> > > >
>> > > > Rui Barradas
>> > > >
>> > > > At 23:16 on 16/08/21, Paul Bernal wrote:
>> > > > > Dear Rui,
>> > > > >
>> > > > > The hist() function comes from the graphics package, from what I
>> > > > > could see. The thing is that I want to divide the Amount column into
>> > > > > several bins and then generate three different histograms, one for
>> > > > > each AF period (AF refers to fiscal years). As you can see, the data
>> > > > > contains three fiscal years (2017, 2020 and 2021). I want to see the
>> > > > > percentage of cases that fall into different amount categories: 15,000
>> > > > > and below, 16,000 to 17,000, 18,000 to 19,000, and so on.
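
If fixed amount categories are wanted rather than automatically chosen bins, one option (a sketch, not a definitive answer) is cut() with explicit break points followed by prop.table(). The break points below are illustrative assumptions, since the categories are only partly specified above. The Amount values are copied from the posted data, and the per-year row counts (6, 35, 33) are my reading of the dput() output.

```r
amount <- c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
            15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
            15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
            16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
            15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
            15000, 15000, 15000, 15000)
# Assumption: 6, 35 and 33 rows per fiscal year, as read from the dput() output.
fy <- factor(rep(c("AF 2017", "AF 2020", "AF 2021"), times = c(6, 35, 33)))

# Assumed category edges; intervals are right-closed, so 15000 falls in (0,15000].
brks <- c(0, 15000, 20000, 30000, 50000, Inf)
bins <- cut(amount, breaks = brks, dig.lab = 6)

# Counts per fiscal year and category, then row-wise percentages.
tab <- table(fy, bins)
pct <- round(100 * prop.table(tab, margin = 1), 1)
print(pct)

# One bar chart per fiscal year with the percentage above each bar.
old_par <- par(mfrow = c(1, 3), mar = c(7, 4, 3, 1))
for (nm in rownames(pct)) {
  bp <- barplot(pct[nm, ], main = nm, ylab = "Percent of cases",
                ylim = c(0, 110), las = 2, cex.names = 0.7)
  text(bp, pct[nm, ], labels = paste0(pct[nm, ], "%"), pos = 3, cex = 0.8)
}
par(old_par)
```

Strictly speaking this draws bar charts of binned data rather than histograms, which seems closer to the request when the categories have unequal widths.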
>> > > > >
>> > > > > Thanks for your kind help.
>> > > > >
>> > > > > Paul
>> > > > >
>> > > > > On Mon, Aug 16, 2021 at 17:07, Rui Barradas (<ruipbarradas using sapo.pt>) wrote:
>> > > > >
>> > > > >     Hello,
>> > > > >
>> > > > >     The function Hist comes from what package?
>> > > > >
>> > > > >     Are you sure you don't want a bar plot?
>> > > > >
>> > > > >
>> > > > >     agg <- aggregate(Amount ~ Date, datasetregs, sum)
>> > > > >     bp <- barplot(Amount ~ Date, agg)
>> > > > >     with(agg, text(bp, Amount/2, labels = Amount))
>> > > > >
>> > > > >
>> > > > >     Hope this helps,
>> > > > >
>> > > > >     Rui Barradas
>> > > > >
>> > > > >     At 22:54 on 16/08/21, Paul Bernal wrote:
>> > > > >      > Hello everyone,
>> > > > >      >
>> > > > >      > I am currently working with R version 4.1.0 and I am trying to
>> > > > >      > include the percentage distribution inside the columns of a
>> > > > >      > histogram. I want to generate three histograms, one for each
>> > > > >      > fiscal year (the Date column contains three fiscal years: AF 2017,
>> > > > >      > AF 2020 and AF 2021). However, I can't seem to accomplish this.
>> > > > >      >
>> > > > >      > Here is my data:
>> > > > >      >
>> > > > >      > structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
>> 2L,
>> > > > >      > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> > 2L,
>> > > > >      > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> > 2L,
>> > > > >      > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> > 3L,
>> > > > >      > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> > 3L,
>> > > > >      > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"),
>> > class =
>> > > > >      > "factor"),
>> > > > >      >      Amount = c(40100, 101100, 35000, 40100, 15000, 45100,
>> > 40200,
>> > > > >      >      15000, 35000, 35100, 20300, 40100, 15000, 67100,
>> 17100,
>> > > > 15000,
>> > > > >      >      15000, 50100, 35100, 15000, 15000, 15000, 15000,
>> 15000,
>> > > > 15000,
>> > > > >      >      15000, 15000, 15000, 15000, 15000, 15000, 15000,
>> 15000,
>> > > > 15000,
>> > > > >      >      15000, 15000, 20100, 15000, 15000, 15000, 15000,
>> 15000,
>> > > > 15000,
>> > > > >      >      16600, 15000, 15000, 15700, 15000, 15000, 15000,
>> 15000,
>> > > > 15000,
>> > > > >      >      15000, 15000, 15000, 15000, 20200, 21400, 25100,
>> 15000,
>> > > > 15000,
>> > > > >      >      15000, 15000, 15000, 15000, 25600, 15000, 15000,
>> 15000,
>> > > > 15000,
>> > > > >      >      15000, 15000, 15000, 15000)), row.names = c(NA, -74L),
>> > class
>> > > > =
>> > > > >      > "data.frame")
>> > > > >      >
>> > > > >      > I would like to modify the following script:
>> > > > >      >
>> > > > >      > with(datasetregs, Hist(Amount, groups=Date, scale="frequency",
>> > > > >      >   breaks="Sturges", col="darkgray"))
>> > > > >      >
>> > > > >      > # The only thing missing here is the percentage corresponding
>> > > > >      > # to each bin (I would like to see the percentages inside each
>> > > > >      > # column, or on top outside if possible).
>> > > > >      >
>> > > > >      > Any help will be greatly appreciated.
>> > > > >      >
>> > > > >      > Best regards,
>> > > > >      >
>> > > > >      > Paul.
>> > > > >      >
>> > > > >      >       [[alternative HTML version deleted]]
>> > > > >      >
>> > > > >      > ______________________________________________
>> > > > >      > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > > > >      > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > > >      > PLEASE do read the posting guide
>> > > > >      > http://www.R-project.org/posting-guide.html
>> > > > >      > and provide commented, minimal, self-contained, reproducible code.
>> > > > >      >