A.3 Building a Simple Plot
ggplot2 has a simple requirement for data structures: they must be stored in data frames, and each type of variable that is mapped to an aesthetic must be stored in its own column. In the simpledat examples we looked at earlier, we first mapped one variable to the x aesthetic and another to the fill aesthetic; then we changed the mapping specification to change which variable was mapped to which aesthetic.
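As a minimal sketch of the one-column-per-variable requirement, suppose (hypothetically) that some measurements arrived in a "wide" layout, with one column of y values per group. The names wide and long and the values below are illustrative assumptions, not part of the original example. Using only base R, the values can be rearranged so that the x values, the y values, and the group labels each get their own column:

```r
# Hypothetical "wide" layout: one column of y values per group.
wide <- data.frame(
  xval = c(1, 2),
  A = c(3, 6),
  B = c(5, 9)
)

# Rearranged "long" layout: one column per variable, one row per observation.
long <- data.frame(
  xval  = rep(wide$xval, times = 2),           # x values repeated for each group
  yval  = c(wide$A, wide$B),                   # all y values stacked in one column
  group = rep(c("A", "B"), each = nrow(wide))  # which group each row came from
)

long
```

In the long layout, each of xval, yval, and group can be mapped to an aesthetic by name.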
We’ll walk through a simple example here. First, we’ll make a data frame of some sample data:
dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

dat
#>   xval yval group
#> 1    1    3     A
#> 2    2    5     B
#> 3    3    6     A
#> 4    4    9     B
A basic ggplot() specification looks like this:
ggplot(dat, aes(x = xval, y = yval))
This creates a ggplot object using the data frame dat. It also specifies default aesthetic mappings within aes():
x = xval maps the column xval to the x position.
y = yval maps the column yval to the y position.
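One way to see what such a specification contains is to inspect the object. The sketch below assumes that the mapping and layers fields of a ggplot object are accessible with $; these are internal details that may differ between ggplot2 versions, and the variable name plot_spec is illustrative.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

# The specification alone: data plus default aesthetic mappings, no geoms yet.
plot_spec <- ggplot(dat, aes(x = xval, y = yval))

# Names of the aesthetics that have default mappings (assumed internal field).
names(plot_spec$mapping)

# Number of layers added so far (assumed internal field).
length(plot_spec$layers)
```

The object holds the mappings for x and y but no layers, so there is nothing to draw yet.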
After we’ve given ggplot the data frame and the aesthetic mappings, there’s one more critical component: we need to tell it what geometric objects to add. At this point, ggplot2 doesn’t know if we want bars, lines, points, or something else to be drawn on the graph. We’ll add geom_point() to draw points, resulting in a scatter plot (Figure A.7):
ggplot(dat, aes(x = xval, y = yval)) +
  geom_point()
If you’re going to reuse some of these components, you can store them in variables. We can save the ggplot object in p, and then add geom_point() to it. This has the same effect as the preceding code:
p <- ggplot(dat, aes(x = xval, y = yval))

p +
  geom_point()
[Figure: the same scatter plot as Figure A.7, with four points at (1, 3), (2, 5), (3, 6), and (4, 9)]
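A short sketch of why storing the base object is useful: the same p can be combined with different geoms, and adding a geom returns a new object rather than changing p. The names points_plot and lines_plot are illustrative, and the use of the layers field to count layers is an assumption about ggplot2 internals.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

# Base specification, stored once.
p <- ggplot(dat, aes(x = xval, y = yval))

# Two different plots built from the same base object.
points_plot <- p + geom_point()
lines_plot  <- p + geom_line()

# p itself is unchanged: it still has no layers (assumed internal field).
length(p$layers)
length(points_plot$layers)
```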
We can also map the variable group to the color of the points, by putting aes() inside the call to geom_point(), and specifying colour = group (Figure A.8):
p +
  geom_point(aes(colour = group))
[Figure A.8: scatter plot of yval against xval with the four points colored by group; a legend shows the two levels, A and B]
This doesn’t alter the default aesthetic mappings that we defined previously, inside of ggplot(...). What it does is add an aesthetic mapping for this particular geom, geom_point(). If we added other geoms, this mapping would not apply to them.
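To illustrate that a geom-level mapping stays local, the sketch below adds a second geom, geom_line(), that is not in the original example. The variable name two_layer_plot is illustrative. The ggplot2 helper layer_data() is used only to look at the colours computed for each layer.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# The colour mapping is given only to geom_point(), not to geom_line().
two_layer_plot <- p +
  geom_line() +
  geom_point(aes(colour = group))

# Layer 1 (the line) uses a single colour; layer 2 (the points) uses one per group.
unique(layer_data(two_layer_plot, 1)$colour)
unique(layer_data(two_layer_plot, 2)$colour)
```

Both layers still inherit the default x and y mappings from ggplot(); only the colour mapping is restricted to the points.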
Contrast this aesthetic mapping with aesthetic setting. This time, we won’t use aes(); we’ll just set the value of colour directly (Figure A.9):
p +
  geom_point(colour = "blue")
[Figure A.9: scatter plot of the same four points, all drawn in the single color set with colour = "blue"; there is no legend]
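The difference between mapping and setting can be seen by putting the same value in both places. In the sketch below, the second plot is a deliberately mistaken variant that is not in the original text: inside aes(), "blue" is treated as a data value with a single level and is passed through the colour scale, so the points are not necessarily drawn in blue. The names set_plot and mapped_plot are illustrative.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Setting: colour is a fixed visual property of the points.
set_plot <- p + geom_point(colour = "blue")

# Mapping (likely a mistake): "blue" is treated as data and sent through the scale.
mapped_plot <- p + geom_point(aes(colour = "blue"))

# Compare the colours actually computed for the points in each plot.
unique(layer_data(set_plot)$colour)
unique(layer_data(mapped_plot)$colour)
```

The mapped version also produces a legend with a single entry labelled blue, which is usually a sign that a setting was intended.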
We can also modify the scales; that is, the mappings from data to visual attributes. Here, we’ll change the x scale so that it has a larger range (Figure A.10):
p +
  geom_point() +
  scale_x_continuous(limits = c(0, 8))
[Figure A.10: the same scatter plot with the x-axis extended to run from 0 to 8]
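ggplot2 also provides xlim() as a shorthand for setting the limits of the x scale. The sketch below builds the plot both ways and reads the limits back with layer_scales(); calling get_limits() on the returned scale object is an assumption about the scale's interface, and the variable names are illustrative.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Explicit scale specification, as in the text.
scale_plot <- p +
  geom_point() +
  scale_x_continuous(limits = c(0, 8))

# Shorthand that sets the same limits.
xlim_plot <- p +
  geom_point() +
  xlim(0, 8)

# Read back the x limits of each plot (assumed scale method).
layer_scales(scale_plot)$x$get_limits()
layer_scales(xlim_plot)$x$get_limits()
```

The explicit scale function is the more general form, since it accepts other arguments besides limits.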
If we go back to the example with the colour = group mapping, we can also modify the color scale:
p +
  geom_point(aes(colour = group)) +
  scale_colour_manual(values = c("orange", "forestgreen"))
[Figure: scatter plot with the points colored by group, using orange for A and forest green for B, as shown in the legend]
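With an unnamed vector, the colours are assigned to the levels in order. As a variation that is not in the original text, scale_colour_manual() also accepts a named vector, which ties each colour to a specific level regardless of order. The name manual_plot is illustrative.

```r
library(ggplot2)

dat <- data.frame(
  xval = 1:4,
  yval = c(3, 5, 6, 9),
  group = c("A", "B", "A", "B")
)

p <- ggplot(dat, aes(x = xval, y = yval))

# Named values: each colour is matched to a group level by name.
manual_plot <- p +
  geom_point(aes(colour = group)) +
  scale_colour_manual(values = c(B = "forestgreen", A = "orange"))

# Colours computed for the four points, in row order.
layer_data(manual_plot)$colour
```

Even though B is listed first in values, the points in group A are drawn in orange and those in group B in forest green.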
Both times when we modified the scale, the guide also changed. With the x scale, the guide was the markings along the x-axis. With the color scale, the guide was the legend.
Notice that we’ve used + to join together the pieces. In this last example, we ended a line with +, then added more on the next line. If you are going to have multiple lines, you have to put the + at the end of each line, instead of at the beginning of the next line. Otherwise, R’s parser won’t know that there’s more stuff coming; it’ll think you’ve finished the expression and evaluate it.