# Workflow: Pipes {#workflow-pipes}

```{r, results = "asis", echo = FALSE}
status("restructuring")
```

## Introduction

Pipes are a powerful tool for clearly expressing a sequence of multiple operations.
We briefly introduced them in the previous chapter, but before going much further, we want to explain a little more about how they work and give a splash of history.

### Prerequisites

The pipe `|>` is built into R itself, so you don't need anything else 😄.
But we'll also discuss another historically important pipe, `%>%`, which is provided by the magrittr package and is available whenever you load the tidyverse.

```{r setup, message = FALSE}
library(tidyverse)
```

## Why use a pipe?

The point of the pipe is to help you write code in a way that is easier to read and understand.
Imagine you wanted to express the following sequence of actions as R code: find keys, start car, drive to work, park.
You could write it as nested function calls:

```{r, eval = FALSE}
park(drive(start_car(find("keys")), to = "work"))
```

But writing it out using the pipe gives it a more natural and easier-to-read structure:

```{r, eval = FALSE}
find("keys") |>
  start_car() |>
  drive(to = "work") |>
  park()
```

Behind the scenes, the pipe actually transforms your code to the first form.
In other words, `x |> f(y)` is equivalent to `f(x, y)`.
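
One way to see this rewriting for yourself is to capture a piped expression with `quote()` and print it.
The following is a minimal sketch, assuming R 4.1 or later; `x`, `f()`, and `y` are placeholder names that are never evaluated.

```{r}
# quote() captures the parsed expression without evaluating it,
# so x, f, and y do not need to exist.
piped <- quote(x |> f(y))
piped

# The parser has already rewritten the pipe into an ordinary call.
identical(piped, quote(f(x, y)))
```

Because the rewriting happens when the code is parsed, R ends up evaluating the same call whether you write the piped or the nested form.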

## magrittr and the `%>%` pipe

If you've been using the tidyverse for a while, you might be more familiar with `%>%` than `|>`.
`%>%` comes from the **magrittr** package by Stefan Milton Bache and has been available since 2014.
This pipe was so successful that in 2021 the base pipe, `|>`, was added in R 4.1.0.

`|>` is inspired by `%>%`, and the tidyverse team was involved in its design.
`|>` offers fewer features than `%>%`, but we largely believe this to be a feature.
`%>%` was an experiment and included many speculative features that seemed like a good idea at the time, but in hindsight added too much complexity relative to their advantages.
The development of the base pipe gave us an opportunity to reset back to the most useful core.

## Changing the argument

There is one feature that `%>%` has that `|>` currently lacks: a very easy way to change which argument you pass the object to.
You just put a `.` where you want the object on the left of the pipe to go.
Ironically this is particularly important for many base functions which were designed well before the pipe existed.
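
As a minimal sketch of the `.` placeholder, consider `lm()`, whose first argument is a formula and whose data frame goes in the `data` argument.
The choice of `lm()` and the `mpg ~ wt` model are illustrative assumptions, not examples taken from this chapter.

```{r}
library(magrittr)

# lm() takes a formula first, so the data frame on the left of the pipe
# has to go to `data`; the `.` marks where it should be inserted.
fit_piped <- mtcars %>% lm(mpg ~ wt, data = .)

# The same model written as an ordinary call.
fit_direct <- lm(mpg ~ wt, data = mtcars)

# Both fits should have the same coefficients.
all.equal(coef(fit_piped), coef(fit_direct))
```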

One particularly challenging example is extracting a single column out of a data frame with `$`.
With `%>%` you can write the fairly straightforward:

```{r}
mtcars %>% .$cyl
```

But the base pipe requires the rather cryptic:

```{r}
mtcars |> (`$`)(cyl)
```

Fortunately, dplyr provides a way out of this common problem with `pull()`:

```{r}
mtcars |> pull(cyl)
```

magrittr offers a number of other variations on the pipe that you might want to learn about.
We don't teach them here because none of them has been sufficiently popular that you could reasonably expect a randomly chosen R user to recognize them.

In R 4.2, the base pipe will gain its own placeholder, `_`.
The placeholder must be supplied to a named argument.
It doesn't solve the `$` problem above, but it helps out in lots of other places.

Expect the base pipe to continue to evolve.
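
As a sketch of the `_` placeholder described above, the data frame can be sent to the `data` argument of a modelling function.
This assumes R 4.2 or later (earlier versions cannot parse the code, so the chunk is not evaluated here), and the use of `lm()` with the `mpg ~ wt` model is an illustrative assumption only.

```{r, eval = FALSE}
# Requires R >= 4.2. The placeholder must be passed to a named argument,
# here `data`.
fit_piped <- mtcars |> lm(mpg ~ wt, data = _)

# The equivalent ordinary call.
fit_direct <- lm(mpg ~ wt, data = mtcars)

# Both fits should have the same coefficients.
all.equal(coef(fit_piped), coef(fit_direct))
```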