1.7 Chaining Functions Together With %>%, the Pipe Operator

1.7.1 Problem

You want to call one function, then pass the result to another function, and another, in a way that is easily readable.

1.7.2 Solution

Use %>%, the pipe operator. For example:

library(dplyr) # Loading dplyr makes the %>% pipe available

morley # Look at the morley data set
#>     Expt Run Speed
#> 001    1   1   850
#> 002    1   2   740
#> 003    1   3   900
#>  ...<94 more rows>...
#> 098    5  18   800
#> 099    5  19   810
#> 100    5  20   870

morley %>%
  filter(Expt == 1) %>%
  summary()
#>       Expt        Run            Speed     
#>  Min.   :1   Min.   : 1.00   Min.   : 650  
#>  1st Qu.:1   1st Qu.: 5.75   1st Qu.: 850  
#>  Median :1   Median :10.50   Median : 940  
#>  Mean   :1   Mean   :10.50   Mean   : 909  
#>  3rd Qu.:1   3rd Qu.:15.25   3rd Qu.: 980  
#>  Max.   :1   Max.   :20.00   Max.   :1070

This takes the morley data set and passes it to the filter() function from dplyr, which keeps only the rows where Expt is equal to 1. That result is then passed to the summary() function, which calculates summary statistics for each column of the data.

Without the pipe operator, the code above would be written like this:

summary(filter(morley, Expt == 1))

In this code, function calls are processed from the inside outward. From a mathematical viewpoint, this makes perfect sense, but nested calls can be hard to read, especially when there are many of them.

1.7.3 Discussion

This pattern, with the %>% operator, is widely used with tidyverse packages, because they contain many functions that each do one relatively small thing. The idea is that these functions are building blocks that allow users to compose function calls to produce the desired result.

To illustrate what’s going on, here’s a simpler example of two equivalent pieces of code:

f(x)

# Equivalent to:
x %>% f()

In essence, the pipe operator takes the value on its left and places it as the first argument of the function call on its right.
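As a concrete check, here is a minimal sketch using base R's sqrt(), which is chosen purely for illustration and is not part of the recipe above. It loads magrittr directly so that the example does not depend on dplyr:

```r
library(magrittr)  # defines %>%; dplyr makes the same operator available

x <- c(1, 4, 9)

direct <- sqrt(x)        # ordinary function call
piped  <- x %>% sqrt()   # x is placed as the first argument of sqrt()

identical(direct, piped)
#> [1] TRUE
```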

It can be used for multiple function calls, in a chain:

h(g(f(x)))

# Equivalent to:
x %>%
  f() %>%
  g() %>%
  h()

In a function chain, the lexical ordering of the function calls is the same as the order in which they’re computed.
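To make the ordering concrete, the sketch below substitutes the base R functions abs(), sqrt(), and sum() for the placeholder names f, g, and h. These stand-ins are an assumption made for illustration only:

```r
library(magrittr)  # defines %>%

x <- c(-4, 9, -16)

# Nested form: read from the inside outward
nested <- sum(sqrt(abs(x)))

# Piped form: read top to bottom, in the order the calls are evaluated
piped <- x %>%
  abs() %>%    # 4 9 16
  sqrt() %>%   # 2 3 4
  sum()        # 9

piped
#> [1] 9
identical(nested, piped)
#> [1] TRUE
```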

If you want to store the final result, you can use the <- operator at the beginning. For example, this will replace the original x with the result of the function chain:

x <- x %>%
  f() %>%
  g() %>%
  h()
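Note that a pipe chain on its own does not modify the object on the left; the result must be assigned to be kept. A small sketch, using sort() and rev() as illustrative stand-ins for the functions in the chain:

```r
library(magrittr)  # defines %>%

x <- c(3, 1, 2)

x %>% sort() %>% rev()   # the result is printed, but x is not changed
#> [1] 3 2 1
x
#> [1] 3 1 2

x <- x %>% sort() %>% rev()   # assign the result back to x
x
#> [1] 3 2 1
```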

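If the function calls take additional arguments, those arguments shift one position to the right when the pipe operator is used: the value on the left of the pipe becomes the first argument, and the arguments written in the call follow it. Going back to code from the first example, these two are equivalent: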

filter(morley, Expt == 1)

morley %>% filter(Expt == 1)
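The same rule holds for any function, with positional or named arguments. The following sketch uses base R's round() and mean(), picked here only as familiar examples:

```r
library(magrittr)  # defines %>%

x <- c(3.14159, 2.71828)

# The piped value becomes the first argument; 2 becomes the second (digits)
rounded <- x %>% round(2)
rounded
#> [1] 3.14 2.72
identical(rounded, round(x, 2))
#> [1] TRUE

# Named arguments are passed along unchanged
avg <- c(1, NA, 3) %>% mean(na.rm = TRUE)
avg
#> [1] 2
```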

The pipe operator actually comes from the magrittr package, but dplyr imports it and makes it available when you call library(dplyr).
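If you only need the pipe, loading magrittr by itself is enough. The sketch below avoids dplyr entirely and uses base R's subset() and nrow() in place of filter(); that substitution is an assumption made to keep the example free of dplyr:

```r
library(magrittr)  # the pipe is available without loading dplyr

# Count the rows of morley that belong to experiment 1
n_expt1 <- morley %>%
  subset(Expt == 1) %>%
  nrow()

n_expt1
#> [1] 20
```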

1.7.4 See Also

For many more examples of how to use %>% in data manipulation, see Chapter ??.