3.2 Grouping Bars Together
3.2.2 Solution
Map a variable to fill, and use geom_col(position = "dodge").
In this example we’ll use the cabbage_exp data set, which has two categorical variables, Cultivar and Date, and one continuous variable, Weight:
library(ggplot2)   # Load ggplot2 for ggplot() and geom_col()
library(gcookbook) # Load gcookbook for the cabbage_exp data set
cabbage_exp
#>   Cultivar Date Weight        sd  n         se
#> 1      c39  d16   3.18 0.9566144 10 0.30250803
#> 2      c39  d20   2.80 0.2788867 10 0.08819171
#> 3      c39  d21   2.74 0.9834181 10 0.31098410
#> 4      c52  d16   2.26 0.4452215 10 0.14079141
#> 5      c52  d20   3.11 0.7908505 10 0.25008887
#> 6      c52  d21   1.47 0.2110819 10 0.06674995
We’ll map Date to the x position and map Cultivar to the fill color (Figure 3.4):
ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge")
[Figure 3.4: Grouped bar graph with Date on the x-axis (d16, d20, d21), Weight on the y-axis, and two side-by-side bars per date, filled by Cultivar (c39, c52)]
3.2.3 Discussion
The most basic bar graphs have one categorical variable on the x-axis and one continuous variable on the y-axis. Sometimes you’ll want to use another categorical variable to divide up the data, in addition to the variable on the x-axis. You can produce a grouped bar plot by mapping that variable to fill, which represents the fill color of the bars. You must also use position = "dodge", which tells the bars to “dodge” each other horizontally; if you don’t, you’ll end up with a stacked bar plot (Recipe 3.7).
As with variables mapped to the x-axis of a bar graph, variables that are mapped to the fill color of bars must be categorical rather than continuous variables.
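When the grouping variable is stored as a number, ggplot2 treats it as continuous, so it generally needs to be converted with factor() before it is mapped to fill. The following is a minimal sketch of that conversion; the data frame df and its columns site, dose, and yield are hypothetical and not part of gcookbook:
```r
library(ggplot2)

# Hypothetical data: 'dose' is numeric but acts as a grouping label
df <- data.frame(
  site  = rep(c("A", "B"), each = 2),
  dose  = rep(c(1, 2), times = 2),
  yield = c(3.1, 2.4, 2.8, 3.5)
)

# Wrapping dose in factor() makes it categorical, so each site
# gets one dodged bar per dose level
p <- ggplot(df, aes(x = site, y = yield, fill = factor(dose))) +
  geom_col(position = "dodge")
p
```
Converting inside aes() leaves the original column untouched; you could instead convert the column once in the data frame if it is used as a category throughout.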
To add a black outline, use colour = "black" inside geom_col(). To set the colors, you can use scale_fill_brewer() or scale_fill_manual(). In Figure 3.5 we’ll use the Pastel1 palette from RColorBrewer:
ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")
[Figure 3.5: The same grouped bar graph (Date on the x-axis, Weight on the y-axis, fill showing Cultivar) with black outlines and the Pastel1 palette]
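If you would rather pick the colors yourself, scale_fill_manual() takes a vector of color values. This is a sketch under the assumption that naming the vector by factor level is the clearest way to pair colors with cultivars; the two hex codes are arbitrary choices for illustration:
```r
library(ggplot2)
library(gcookbook)  # for the cabbage_exp data set

# Arbitrary example colors, named by Cultivar level so the pairing is explicit
cultivar_colors <- c(c39 = "#FBB4AE", c52 = "#B3CDE3")

p <- ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_manual(values = cultivar_colors)
p
```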
Other aesthetics, such as colour (the color of the outlines of the bars) or linetype, can also be used for grouping variables, but fill is probably what you’ll want to use.
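To illustrate grouping with an aesthetic other than fill, here is a minimal sketch that maps Cultivar to the outline line type instead. Setting the fill to white and the outline to black is our own choice, made only so that the line types are visible:
```r
library(ggplot2)
library(gcookbook)  # for the cabbage_exp data set

# Cultivar is distinguished by the outline's line type rather than by fill
p <- ggplot(cabbage_exp, aes(x = Date, y = Weight, linetype = Cultivar)) +
  geom_col(position = "dodge", fill = "white", colour = "black")
p
```
The bars still dodge because Cultivar is a categorical variable mapped to an aesthetic, but the two cultivars are harder to tell apart at a glance than with fill.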
Note that if there are any missing combinations of the categorical variables, that bar will be missing, and the neighboring bars will expand to fill that space. If we remove the last row from our example data frame, we get Figure 3.6:
ce <- cabbage_exp[1:5, ]
ce
#>   Cultivar Date Weight        sd  n         se
#> 1      c39  d16   3.18 0.9566144 10 0.30250803
#> 2      c39  d20   2.80 0.2788867 10 0.08819171
#> 3      c39  d21   2.74 0.9834181 10 0.31098410
#> 4      c52  d16   2.26 0.4452215 10 0.14079141
#> 5      c52  d20   3.11 0.7908505 10 0.25008887
ggplot(ce, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")
[Figure 3.6: Grouped bar graph with a missing bar: there is no c52 bar at d21, and the c39 bar at d21 expands to the full width of the group]
If your data has this issue, you can manually make an entry for the missing factor level combination with an NA for the y variable.
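One way to add the missing combination without typing the row by hand is to build every Cultivar and Date pairing and merge the observed data onto it. The following is a minimal sketch using base R’s expand.grid() and merge(); the names all_combos and ce_full are made up for this illustration:
```r
library(ggplot2)
library(gcookbook)  # for the cabbage_exp data set

ce <- cabbage_exp[1:5, ]  # same subset as above; the c52/d21 row is missing

# Every Cultivar x Date pairing that appears in the data
all_combos <- expand.grid(
  Cultivar = unique(ce$Cultivar),
  Date     = unique(ce$Date)
)

# Keep all pairings; those with no observed row get NA in Weight, sd, n, and se
ce_full <- merge(all_combos, ce, by = c("Cultivar", "Date"), all.x = TRUE)

p <- ggplot(ce_full, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(position = "dodge", colour = "black") +
  scale_fill_brewer(palette = "Pastel1")
p
```
The merged data frame has six rows, with NA in Weight for c52 on d21. When the plot is drawn, ggplot2 will typically warn that a row with missing values was removed, and the c39 bar at d21 should keep the same width as the other bars.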