2.5 Creating a Box Plot
2.5.2 Solution
To make a box plot (Figure 2.10), use plot() and pass it a factor of x values and a vector of y values. When x is a factor (as opposed to a numeric vector), plot() automatically creates a box plot:
plot(ToothGrowth$supp, ToothGrowth$len)
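As a quick check of the factor rule, the following sketch inspects the column types before plotting. It assumes only the built-in ToothGrowth data set; wrapping dose in factor() is an illustration added here, not part of the recipe itself.

```r
# supp is stored as a factor, so plot() draws a box plot for it
is.factor(ToothGrowth$supp)

# dose is stored as a numeric vector, so passing it directly to plot()
# would give a scatter plot instead
is.numeric(ToothGrowth$dose)

# Converting dose to a factor first makes plot() draw one box per dose
plot(factor(ToothGrowth$dose), ToothGrowth$len)
```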
If the two vectors are in the same data frame, you can also use the boxplot() function with formula syntax. With this syntax, you can combine two variables on the x-axis, as in Figure 2.10:
# Formula syntax
boxplot(len ~ supp, data = ToothGrowth)
# Put interaction of two variables on x-axis
boxplot(len ~ supp + dose, data = ToothGrowth)
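To see which groups the two-variable formula produces, one option is to ask boxplot() for its statistics without drawing. This is a minimal sketch; the variable name b is arbitrary.

```r
# plot = FALSE returns the box plot statistics without drawing anything
b <- boxplot(len ~ supp + dose, data = ToothGrowth, plot = FALSE)

b$names   # one label per supp/dose combination
b$n       # number of observations in each group
b$stats   # rows: lower whisker, lower hinge, median, upper hinge, upper whisker
```

Each column of b$stats corresponds to one box in the plot, in the same order as b$names.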
With the ggplot2 package, you can get a similar result (Figure 2.11) with geom_boxplot():
library(ggplot2)
ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot()
[Figure 2.11: box plot of len by supp (OJ and VC), and box plot of len by interaction(supp, dose) (OJ.0.5, VC.0.5, OJ.1, VC.1, OJ.2 and VC.2)]
It’s also possible to make box plots for multiple variables by combining the variables with interaction(), as in Figure 2.11:
ggplot(ToothGrowth, aes(x = interaction(supp, dose), y = len)) +
  geom_boxplot()
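It can help to look at what interaction() returns on its own, since its levels become the x-axis labels. The sketch below uses only base R; the names grp and grp_slash and the "/" separator are illustrative choices, not part of the recipe.

```r
# interaction() builds one factor whose levels are the supp/dose combinations
grp <- interaction(ToothGrowth$supp, ToothGrowth$dose)
levels(grp)
#> [1] "OJ.0.5" "VC.0.5" "OJ.1"   "VC.1"   "OJ.2"   "VC.2"

# The sep argument changes the separator used in the level names
grp_slash <- interaction(ToothGrowth$supp, ToothGrowth$dose, sep = "/")
levels(grp_slash)
```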
Note
You may have noticed that the box plots from base graphics are ever-so-slightly different from those from ggplot2. This is because they use slightly different methods for calculating quantiles. See ?geom_boxplot and ?boxplot.stats for more information on how they differ.
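The difference can be seen numerically for the OJ group. This sketch assumes that ggplot2 computes the box edges with quantile() at its default settings, as the ?geom_boxplot help page suggests, while boxplot.stats() uses the hinges from fivenum(). The variable names are arbitrary.

```r
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Box edges as base graphics computes them: the lower and upper hinges
base_hinges <- boxplot.stats(oj)$stats[c(2, 4)]
base_hinges
#> [1] 15.2 25.8

# Box edges as first and third quartiles from quantile() defaults
# (assumed here to be what ggplot2 uses)
gg_quartiles <- quantile(oj, c(0.25, 0.75), names = FALSE)
gg_quartiles
#> [1] 15.525 25.725
```

The medians agree in both approaches; only the box edges, and occasionally the whiskers and outliers that depend on them, can shift slightly.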
2.5.3 See Also
For more on making basic box plots, see Recipe ??.