3.2 Grouping Bars Together

3.2.1 Problem

You want to group bars together by a second variable.

3.2.2 Solution

Map a variable to fill, and use geom_col(position = "dodge").

In this example we’ll use the cabbage_exp data set, which has two categorical variables, Cultivar and Date, and one continuous variable, Weight:

library(ggplot2)    # Load ggplot2 for ggplot() and geom_col()
library(gcookbook)  # Load gcookbook for the cabbage_exp data set
cabbage_exp
#>   Cultivar Date Weight        sd  n         se
#> 1      c39  d16   3.18 0.9566144 10 0.30250803
#> 2      c39  d20   2.80 0.2788867 10 0.08819171
#> 3      c39  d21   2.74 0.9834181 10 0.31098410
#> 4      c52  d16   2.26 0.4452215 10 0.14079141
#> 5      c52  d20   3.11 0.7908505 10 0.25008887
#> 6      c52  d21   1.47 0.2110819 10 0.06674995

We’ll map Date to the x position and map Cultivar to the fill color (Figure 3.4):

ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge")

Figure 3.4: Graph with grouped bars

3.2.3 Discussion

The most basic bar graphs have one categorical variable on the x-axis and one continuous variable on the y-axis. Sometimes you’ll want to use another categorical variable to divide up the data, in addition to the variable on the x-axis. You can produce a grouped bar plot by mapping that variable to fill, which represents the fill color of the bars. You must also use position = "dodge", which tells the bars to “dodge” each other horizontally; if you don’t, you’ll end up with a stacked bar plot (Recipe 3.7).
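
As a quick illustration of the difference, the sketch below builds the same plot twice, once with the default position and once with position = "dodge". The object names p_stacked and p_dodged are ours, chosen only for this example:

```r
library(ggplot2)
library(gcookbook)  # For the cabbage_exp data set

# The default position for geom_col() is "stack":
# bars that share a Date are stacked on top of each other
p_stacked <- ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col()

# position = "dodge" places the bars for each Date side by side instead
p_dodged <- ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge")

p_stacked
p_dodged
```

In the stacked version the bar heights add up within each Date, so the y-axis extends well past the largest single Weight value.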

As with variables mapped to the x-axis of a bar graph, variables that are mapped to the fill color of bars must be categorical rather than continuous variables.
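
If the grouping variable happens to be stored as a number, one common approach is to wrap it in factor() inside aes() so that ggplot2 treats it as categorical. The following is only a sketch: the CultivarID column is hypothetical, created here to imitate numeric group codes, and is not part of cabbage_exp:

```r
library(ggplot2)
library(gcookbook)  # For the cabbage_exp data set

# Hypothetical numeric coding of the cultivar (39 or 52), for illustration only
cabbage_num <- cabbage_exp
cabbage_num$CultivarID <- ifelse(cabbage_num$Cultivar == "c39", 39, 52)

# factor() turns the numeric codes into a categorical variable,
# which lets the bars be grouped and dodged by fill
p_factor <- ggplot(cabbage_num,
                   aes(x = Date, y = Weight, fill = factor(CultivarID))) +
  geom_col(position = "dodge")

p_factor
```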

To add a black outline, use colour = "black" inside geom_col(). To set the colors, you can use scale_fill_brewer() or scale_fill_manual(). In Figure 3.5 we’ll use the Pastel1 palette from RColorBrewer:

ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")

Figure 3.5: Grouped bars with black outline and a different color palette
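
scale_fill_manual() works in much the same way, except that you supply the colors yourself. In this sketch the two hex codes are arbitrary picks for illustration, not values taken from the recipe; naming the vector elements after the factor levels ties each color to a specific cultivar:

```r
library(ggplot2)
library(gcookbook)  # For the cabbage_exp data set

# Names in the values vector match the levels of Cultivar,
# so each level gets the color assigned to it regardless of order
p_manual <- ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_manual(values = c(c39 = "#E69F00", c52 = "#56B4E9"))

p_manual
```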

Other aesthetics, such as colour (the color of the outlines of the bars) or linetype, can also be used for grouping variables, but fill is probably what you’ll want to use.
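
A minimal sketch of grouping by outline color rather than fill: here colour is mapped to Cultivar and the fill is set to a constant. The white fill is our choice, made only so that the colored outlines are easy to see:

```r
library(ggplot2)
library(gcookbook)  # For the cabbage_exp data set

# Map Cultivar to the outline colour; fill is a fixed setting, not a mapping
p_outline <- ggplot(cabbage_exp, aes(x = Date, y = Weight, colour = Cultivar)) +
  geom_col(position = "dodge", fill = "white")

p_outline
```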

Note that if any combination of the categorical variables is missing from the data, the bar for that combination will be absent, and the neighboring bars will expand to fill that space. If we remove the last row from our example data frame, we get Figure 3.6:

ce <- cabbage_exp[1:5, ]
ce
#>   Cultivar Date Weight        sd  n         se
#> 1      c39  d16   3.18 0.9566144 10 0.30250803
#> 2      c39  d20   2.80 0.2788867 10 0.08819171
#> 3      c39  d21   2.74 0.9834181 10 0.31098410
#> 4      c52  d16   2.26 0.4452215 10 0.14079141
#> 5      c52  d20   3.11 0.7908505 10 0.25008887

ggplot(ce, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")

Figure 3.6: Graph with a missing bar; the other bar fills the space

If your data has this issue, you can manually make an entry for the missing factor level combination with an NA for the y variable.
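
One way this might look in practice is sketched below, using rbind() to append a row for the missing c52/d21 combination. The names missing_row and ce_filled are ours, and filling the remaining columns with NA is an assumption made so that the new row has the same columns as the rest of the data frame:

```r
library(ggplot2)
library(gcookbook)  # For the cabbage_exp data set

ce <- cabbage_exp[1:5, ]

# Placeholder row for the missing combination; Weight (the y variable) is NA
missing_row <- data.frame(Cultivar = "c52", Date = "d21",
                          Weight = NA, sd = NA, n = NA, se = NA)
ce_filled <- rbind(ce, missing_row)

# The NA row reserves a slot for c52 at d21, so the c39 bar keeps its width
p_filled <- ggplot(ce_filled, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")

p_filled
```

When the plot is drawn, ggplot2 will typically warn that a row containing missing values was removed; that warning is expected here.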

3.2.4 See Also

For more on using colors in bar graphs, see Recipe 3.4.

To reorder the levels of a factor based on the values of another variable, see Recipe ??.