2.4 Creating a Histogram
2.4.2 Solution
To make a histogram (Figure 2.8), use hist()
and pass it a vector of values:
hist(mtcars$mpg)
# Specify approximate number of bins with breaks
hist(mtcars$mpg, breaks = 10)
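When breaks is a single number, R treats it only as a suggestion and picks round break points near that count. As a minimal sketch, you can compute the histogram with plot = FALSE to see which bins were actually chosen, or pass a vector of break points to set them exactly. The break points in seq(10, 34, by = 4) are an illustrative choice that covers the range of mpg; they are not part of the original recipe:

```r
# Compute the histogram without drawing it, to inspect the bins R chose
h <- hist(mtcars$mpg, breaks = 10, plot = FALSE)
h$breaks   # bin boundaries actually used
h$counts   # number of cars falling in each bin

# Pass a vector of break points to control the bins exactly
# (illustrative values; they must span the full range of the data)
hist(mtcars$mpg, breaks = seq(10, 34, by = 4))
```

If the vector of break points does not span the whole range of the data, hist() stops with an error rather than silently dropping values.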
With ggplot2, you can get a similar result using geom_histogram()
(Figure 2.9):
library(ggplot2)
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# With wider bins
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 4)
When you create a histogram without specifying the bin width, ggplot2
prints a message telling you that it’s defaulting to 30 bins and that you should pick a better bin width. It does this because it’s important to explore your data using different bin widths; the default of 30 may or may not show you something useful about your data.
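One way to explore different binnings is sketched below, assuming ggplot2 is installed. It uses the bins argument to fix the number of bins, and binwidth together with boundary to fix both the width and where the bin edges fall. The specific values (10 bins, a width of 2, an edge at 10) are illustrative choices, not recommendations from the recipe:

```r
library(ggplot2)

# Base plot shared by both variants
p <- ggplot(mtcars, aes(x = mpg))

# Fix the number of bins instead of the bin width
p + geom_histogram(bins = 10)

# Fix the bin width, and use boundary to place a bin edge at 10
p + geom_histogram(binwidth = 2, boundary = 10)
```

Setting either bins or binwidth explicitly also suppresses the message about the 30-bin default.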