A.3 Building a Simple Plot

ggplot2 has a simple requirement for data structures: the data must be stored in a data frame, and each variable that is mapped to an aesthetic must be stored in its own column. In the simpledat examples we looked at earlier, we first mapped one variable to the x aesthetic and another to the fill aesthetic; then we changed the mapping specification to change which variable was mapped to which aesthetic.
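As a rough illustration of this requirement, the sketch below reshapes a hypothetical "wide" table, with one column per group, into a layout with one column per variable. The data frame wide and its columns A and B are made up for this example, and only base R is used:

```r
# Hypothetical wide-format data: the y values for groups A and B
# live in separate columns, so there is no single column to map to y.
wide <- data.frame(
  xval = c(1, 3),
  A    = c(3, 6),
  B    = c(5, 9)
)

# Rebuild it so that each variable (x value, y value, group) has its own column.
long <- data.frame(
  xval  = rep(wide$xval, times = 2),
  yval  = c(wide$A, wide$B),
  group = rep(c("A", "B"), each = nrow(wide))
)

long
```

In the long version, yval can be mapped to y and group to an aesthetic such as colour or fill.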

We’ll walk through a simple example here. First, we’ll make a data frame of some sample data:

dat <- data.frame(
  xval = 1:4,
  yval=c(3, 5, 6, 9),
  group=c("A","B","A","B")
)

dat
#>   xval yval group
#> 1    1    3     A
#> 2    2    5     B
#> 3    3    6     A
#> 4    4    9     B

A basic ggplot() specification looks like this:

ggplot(dat, aes(x = xval, y = yval))

This creates a ggplot object using the data frame dat. It also specifies default aesthetic mappings within aes():

  • x = xval maps the column xval to the x position.

  • y = yval maps the column yval to the y position.
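To see what the ggplot object holds at this stage, we can look inside it. This is only a sketch: it relies on the mapping and layers components of the object, which are internal details that may differ between ggplot2 versions:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# The default aesthetic mappings are stored on the plot object.
names(p$mapping)

# No geoms have been added yet, so the plot has no layers to draw.
length(p$layers)
```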

After we’ve given ggplot the data frame and the aesthetic mappings, there’s one more critical component: we need to tell it what geometric objects to add. At this point, ggplot2 doesn’t know if we want bars, lines, points, or something else to be drawn on the graph. We’ll add geom_point() to draw points, resulting in a scatter plot (Figure A.7):

ggplot(dat, aes(x = xval, y = yval)) +
  geom_point()

If you’re going to reuse some of these components, you can store them in variables. We can save the ggplot object in p, and then add geom_point() to it. This has the same effect as the preceding code:

p <- ggplot(dat, aes(x = xval, y = yval))

p +
  geom_point()

Figure A.7: A basic scatter plot
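Storing components in variables is not limited to the base plot. As a small sketch, a geom can be saved too and added to more than one plot. The names pts and dat2 are made up for this example:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Save the geom on its own
pts <- geom_point()

# Same result as p + geom_point()
p1 <- p + pts

# The saved geom can be added to a different plot (dat2 is hypothetical data)
dat2 <- data.frame(xval = 1:3, yval = c(2, 4, 8))
p2 <- ggplot(dat2, aes(x = xval, y = yval)) + pts
```

Typing p1 or p2 at the console draws the corresponding plot.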

We can also map the variable group to the color of the points, by putting aes() inside the call to geom_point(), and specifying colour = group (Figure A.8):

p +
  geom_point(aes(colour = group))

Figure A.8: A scatter plot with a variable mapped to colour

This doesn’t alter the default aesthetic mappings that we defined previously, inside of ggplot(...). What it does is add an aesthetic mapping for this particular geom, geom_point(). If we added other geoms, this mapping would not apply to them.
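A short sketch of the difference, using geom_line() as the second geom (it does not appear in the examples above and is chosen only for illustration):

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# colour is mapped only inside geom_point(), so the line ignores group
# and is drawn as a single line in the default colour.
p_layer <- p +
  geom_line() +
  geom_point(aes(colour = group))

# colour is a default mapping here, so both geoms inherit it and
# geom_line() draws one coloured line per group.
p_default <- ggplot(dat, aes(x = xval, y = yval, colour = group)) +
  geom_line() +
  geom_point()
```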

Contrast this aesthetic mapping with aesthetic setting. This time, we won’t use aes(); we’ll just set the value of colour directly (Figure A.9):

p +
  geom_point(colour = "blue")

Figure A.9: A scatter plot with colors set instead of mapped
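A common slip is to put the constant inside aes(). The sketch below contrasts the two forms. In the second one, "blue" is treated as a data value with a single level, so the points get the first colour of the default palette rather than blue, and a legend appears:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Setting: colour is a fixed property of the points
p_set <- p +
  geom_point(colour = "blue")

# Mapping (probably not what was intended): the string "blue" is
# treated as data and passed through the colour scale.
p_map <- p +
  geom_point(aes(colour = "blue"))
```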

We can also modify the scales; that is, the mappings from data to visual attributes. Here, we’ll change the x scale so that it has a larger range (Figure A.10):

p +
  geom_point() +
  scale_x_continuous(limits = c(0, 8))

Figure A.10: A scatter plot with increased x range
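For reference, here is a sketch of two related ways to change the visible x range. xlim() is a shorthand for setting the limits of the x scale. coord_cartesian() instead zooms the view without touching the scale, which matters when some data fall outside the limits: a scale drops those values, while coord_cartesian() keeps them. With this data set, all points lie inside the range, so the three plots show the same points:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# The form used above: set the limits on the x scale
p_scale <- p +
  geom_point() +
  scale_x_continuous(limits = c(0, 8))

# Shorthand for the same thing
p_xlim <- p +
  geom_point() +
  xlim(0, 8)

# Zoom the coordinate system instead of changing the scale
p_zoom <- p +
  geom_point() +
  coord_cartesian(xlim = c(0, 8))
```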

If we go back to the example with the colour = group mapping, we can also modify the color scale (Figure A.11):

p +
  geom_point(aes(colour = group)) +
  scale_colour_manual(values = c("orange", "forestgreen"))

Figure A.11: A scatter plot with modified colors and a different palette
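In the code above, the colors are matched to the levels of group by position. As a variation, the values can be given as a named vector, so that each color is tied to a level by name, regardless of the order in which they are listed. A minimal sketch:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# The names refer to the levels of group; listing B first does not
# change which colour each level gets.
p_named <- p +
  geom_point(aes(colour = group)) +
  scale_colour_manual(values = c(B = "forestgreen", A = "orange"))
```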

Both times when we modified the scale, the guide also changed. With the x scale, the guide was the markings along the x-axis. With the color scale, the guide was the legend.

Notice that we’ve used + to join together the pieces. In this last example, we ended a line with +, then added more on the next line. If you are going to have multiple lines, you have to put the + at the end of each line, instead of at the beginning of the next line. Otherwise, R’s parser won’t know that there’s more stuff coming; it’ll think you’ve finished the expression and evaluate it.
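The following sketch shows the difference. The broken form is left commented out, since running it would draw p without any geoms and then fail on the second line. The parenthesized form is an alternative for those who prefer a leading +, because R keeps reading until the closing parenthesis:

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Works: each line ends with +, so the parser knows more is coming.
p1 <- p +
  geom_point() +
  scale_x_continuous(limits = c(0, 8))

# Does not work: the first line is already a complete expression,
# so R evaluates it before it sees the + on the next line.
# p
#   + geom_point()

# Also works: inside parentheses, the expression can't end early.
p2 <- (p
  + geom_point()
  + scale_x_continuous(limits = c(0, 8)))
```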