2.3 Creating a Bar Graph
2.3.2 Solution
To make a bar graph of values (Figure 2.5, left), use barplot() and pass it a vector of values for the height of each bar and (optionally) a vector of labels for each bar. If the vector has names for the elements, the names will automatically be used as labels:
# First, take a look at the BOD data
BOD
#>   Time demand
#> 1    1    8.3
#> 2    2   10.3
#> 3    3   19.0
#> 4    4   16.0
#> 5    5   15.6
#> 6    7   19.8
barplot(BOD$demand, names.arg = BOD$Time)
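The automatic labeling mentioned above can be sketched with a named vector. This is a minimal illustration, not part of the original recipe; the variable names demand_by_time and bar_midpoints are made up for the example.

```r
# Illustrative: build a named vector from the BOD data.
# setNames() attaches the Time values (as character strings) as element names.
demand_by_time <- setNames(BOD$demand, BOD$Time)

# No names.arg is needed here: barplot() uses the element names as bar labels.
# barplot() also returns the x positions of the bar midpoints.
bar_midpoints <- barplot(demand_by_time)
```

This should draw the same graph as the call with names.arg, since the labels now travel with the vector itself.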
Sometimes “bar graph” refers to a graph where the bars represent the count of cases in each category. This is similar to a histogram, but with a discrete instead of continuous x-axis. To generate the count of each unique value in a vector, use the table() function:
# There are 11 cases of the value 4, 7 cases of 6, and 14 cases of 8
table(mtcars$cyl)
#>
#>  4  6  8
#> 11  7 14
Then pass the table to barplot() to generate the graph of counts:
# Generate a table of counts
barplot(table(mtcars$cyl))
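As a small variation, the table can be stored first so that it can be inspected before plotting. The sketch below assumes nothing beyond base R; the name cyl_counts and the axis label text are illustrative choices.

```r
# Store the counts so they can be inspected and reused
cyl_counts <- table(mtcars$cyl)

# The table carries the unique values as names; barplot() uses them as labels
names(cyl_counts)

# xlab and ylab are standard graphical arguments accepted by barplot()
barplot(cyl_counts, xlab = "Number of cylinders", ylab = "Number of cars")
```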
With ggplot2, you can get a similar result using geom_col() (Figure 2.6). Notice the difference in the output when the x variable is continuous and when it is discrete:
library(ggplot2)
# Bar graph of values. This uses the BOD data frame, with the
# "Time" column for x values and the "demand" column for y values.
ggplot(BOD, aes(x = Time, y = demand)) +
  geom_col()
[Figure 2.6, continuous x: bar chart of demand (y-axis, labels 0 to 20) against Time (x-axis, labels 2, 4 and 6). Six bars at Time = 1, 2, 3, 4, 5 and 7, with heights 8.3, 10.3, 19.0, 16.0, 15.6 and 19.8; the position for Time = 6 is empty.]
# Convert the x variable to a factor, so that it is treated as discrete
ggplot(BOD, aes(x = factor(Time), y = demand)) +
  geom_col()
[Figure 2.6, discrete x: bar chart of demand (y-axis, labels 0 to 20) against factor(Time) (x-axis, labels 1, 2, 3, 4, 5 and 7). Six bars with heights 8.3, 10.3, 19.0, 16.0, 15.6 and 19.8; there is no entry for 6.]
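To see why the discrete version has no slot for 6, it can help to look at what factor() produces. The following is a sketch under the assumption that a cleaner axis title is wanted; time_levels and p_discrete are illustrative names, and xlab() is used only to replace the default "factor(Time)" label.

```r
library(ggplot2)

# factor() keeps only the values that occur in the data, so 6 is not a level
time_levels <- levels(factor(BOD$Time))
time_levels

# Same discrete bar graph, with a plainer x-axis title
p_discrete <- ggplot(BOD, aes(x = factor(Time), y = demand)) +
  geom_col() +
  xlab("Time")
p_discrete
```

A continuous axis reserves space for every value in the range, while a discrete axis reserves space only for the levels that exist.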
ggplot2 can also plot the number of rows in each category (Figure 2.7) by using geom_bar() instead of geom_col(). Once again, notice the difference between a continuous x-axis and a discrete one. For some kinds of data, it may make more sense to convert the continuous x variable to a discrete one with the factor() function.
# Bar graph of counts. This uses the mtcars data frame, with the "cyl" column for
# x position. The y position is calculated by counting the number of rows for
# each value of cyl.
ggplot(mtcars, aes(x = cyl)) +
  geom_bar()
[Figure 2.7, continuous x: bar chart of count (y-axis, labels 0, 5 and 10) against cyl (x-axis, labels 3 through 9). Three bars at cyl = 4, 6 and 8, with heights 11, 7 and 14.]
# Bar graph of counts, with cyl converted to a factor so that it is treated as discrete
ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()
[Figure 2.7, discrete x: bar chart of count (y-axis, labels 0, 5 and 10) against factor(cyl) (x-axis, labels 4, 6 and 8). Three bars with heights 11, 7 and 14.]
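The counting that geom_bar() performs can also be done by hand, which ties this back to table(). The sketch below is one possible way to do it, not the recipe's own method: it precomputes the counts and plots them as values with geom_col(). The names cyl_df and p_counts are illustrative; Freq is the column name that as.data.frame() gives to table counts.

```r
library(ggplot2)

# Count rows per cylinder value, then convert the table to a data frame.
# Naming the argument (cyl = ...) makes the first column "cyl" (a factor);
# the counts go in a column called "Freq".
cyl_df <- as.data.frame(table(cyl = mtcars$cyl))
cyl_df

# Plot the precomputed counts as values
p_counts <- ggplot(cyl_df, aes(x = cyl, y = Freq)) +
  geom_col()
p_counts
```

Because cyl is already a factor in cyl_df, the x-axis is discrete, so the result should resemble the factor(cyl) version above.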
Note
In previous versions of ggplot2, the recommended way to create a bar graph of values was to use geom_bar(stat = "identity"). As of ggplot2 2.2.0, there is a geom_col() function that does the same thing.
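The two forms mentioned in this note can be compared side by side. This is a minimal sketch, assuming a ggplot2 version of 2.2.0 or later; p_old and p_new are illustrative names, and layer_data() is used only to pull out the computed bar heights for comparison.

```r
library(ggplot2)

# Older style: geom_bar() with the identity stat
p_old <- ggplot(BOD, aes(x = factor(Time), y = demand)) +
  geom_bar(stat = "identity")

# Newer style: geom_col()
p_new <- ggplot(BOD, aes(x = factor(Time), y = demand)) +
  geom_col()

# Both layers compute the same bar heights from the demand column
heights_old <- layer_data(p_old)$y
heights_new <- layer_data(p_new)$y
```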
2.3.3 See Also
See Chapter 3 for more in-depth information about creating bar graphs.