[R] barplot that displays sums of values of 2 y columns grouped by different variables

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Tue Jan 16 00:01:52 CET 2018


It is generally not advisable to get too fancy with the stat functions in
ggplot: the calculations can easily become more complicated than ggplot is
ready to handle. It is better to build a data frame whose columns
correspond directly to the graphical properties you are mapping them to,
and then plot that.

Read [1] for more on this philosophy.

[1] H. Wickham, Tidy Data, Journal of Statistical Software, vol. 59, no.
10, pp. 1-23, Sep. 2014. http://www.jstatsoft.org/v59/i10/

#---
library(ggplot2) # ggplot
library(dplyr)   # `%>%`, group_by, summarise
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)   # gather

dta <- read.table( text =
"city n y
mon 100 200
tor 209 300
edm 98 87
mon 20 76
tor 50 96
edm 62 27
", header = TRUE )

dta2 <- (   dta
         %>% group_by( city )
         %>% summarise( n = sum( n )
                      , y = sum( y )
                      )
         %>% gather( Response, value, -city )
         )

ggplot( dta2, aes( x=city, y=value, fill = Response ) ) +
     geom_bar( stat="identity", position="dodge" )

#' ![](https://i.imgur.com/cosFf3B.png)
#---
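
The original question also asked for the sum to be shown as a label on each bar. What follows is only a sketch of one way to do that, not the only way: it reuses the summarised long-format data from above and adds a geom_text() layer dodged by the same amount as the bars. The dodge width of 0.9 (assumed here to match the default bar width) and the vjust offset are illustrative choices, and geom_col() is used as shorthand for geom_bar( stat="identity" ).

```r
library(ggplot2) # ggplot, geom_col, geom_text, position_dodge
library(dplyr)   # `%>%`, group_by, summarise
library(tidyr)   # gather

dta <- read.table( text =
"city n y
mon 100 200
tor 209 300
edm 98 87
mon 20 76
tor 50 96
edm 62 27
", header = TRUE )

# one row per city/Response combination, holding the summed value
dta2 <- (   dta
         %>% group_by( city )
         %>% summarise( n = sum( n )
                      , y = sum( y )
                      )
         %>% gather( Response, value, -city )
         )

# share one dodge specification so bars and labels line up
# (width 0.9 is an assumption, chosen to match the default bar width)
dodge <- position_dodge( width = 0.9 )

p <- ggplot( dta2, aes( x=city, y=value, fill = Response ) ) +
     geom_col( position = dodge ) +
     geom_text( aes( label = value )  # print the sum itself
              , position = dodge
              , vjust = -0.5          # nudge the label just above the bar
              )
print( p )
```

Because the labels are drawn from the same pre-computed column as the bars, no stat calculation inside ggplot is needed for either layer.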

On Mon, 15 Jan 2018, kenneth dyson wrote:

> I am trying to create a barplot displaying the sums of 2 columns of data 
> grouped by a variable. the data is set up like this:
>
> "city" "n" "y"
> mon 100 200
> tor 209 300
> edm 98 87
> mon 20 76
> tor 50 96
> edm 62 27
>
> the resulting plot should have city as the x-axis, 2 bars per city, 1 
> representing the sum of "n" in that city, the other the sum of "y" in that 
> city.
>
> If possible also show the sum in each bar as a label?
>
> I aggregated the data into sums like this:
>
> sum_data <- aggregate(. ~ City,data=raw_data,sum)
>
> this gave me the sums per city as I wanted but for some reason 1 of the 
> cities is missing in the output.
>
> Using this code for the plot:
>
> ggplot(sum_data,aes(x = City,y = n)) + geom_bar(aes(fill = y),stat = 
> "identity",position = "dodge")
>
> gave me a bar plot with one bar per city showing the sum of y as a color 
> gradient. Not what I expected given the "dodge" position in geom_bar.
>
> Thanks.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


