Suggestions for chapters on logical vectors and numbers (#1212)
parent b5d6735959
commit 69b2a265a8
@ -480,7 +480,7 @@ Instead, you can switch to `dplyr::case_when()`.
 ### `case_when()`

-dplyr's `case_when()` is inspired by SQL's `CASE` statement and provides a flexible way of performing different computations for different computations.
+dplyr's `case_when()` is inspired by SQL's `CASE` statement and provides a flexible way of performing different computations for different conditions.
 It has a special syntax that unfortunately looks like nothing else you'll use in the tidyverse.
 It takes pairs that look like `condition ~ output`.
 `condition` must be a logical vector; when it's `TRUE`, `output` will be used.
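
Note on the hunk above: a minimal sketch of the `condition ~ output` pairs it describes. The vector `x` and the labels are made up for illustration and are not part of the commit.

```r
library(dplyr)

# Hypothetical toy vector, not taken from the chapter
x <- c(-3, 0, 5, NA)

# Each argument is a `condition ~ output` pair; for every element,
# the output of the first condition that is TRUE is used.
sign_label <- case_when(
  x < 0    ~ "negative",
  x == 0   ~ "zero",
  x > 0    ~ "positive",
  is.na(x) ~ "missing"
)

sign_label
```

The comparisons on the missing value evaluate to `NA` rather than `TRUE`, so only the `is.na(x)` pair matches it.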

14 numbers.qmd
@ -18,6 +18,11 @@ We'll finish off by covering the summary functions that pair well with `summariz
 ### Prerequisites

+::: callout-important
+This chapter relies on features only found in dplyr 1.1.0, which is still in development.
+If you want to live on the edge, you can get the dev versions with `devtools::install_github("tidyverse/dplyr")`.
+:::
+
 This chapter mostly uses functions from base R, which are available without loading any packages.
 But we still need the tidyverse because we'll use these base R functions inside of tidyverse functions like `mutate()` and `filter()`.
 Like in the last chapter, we'll use real examples from nycflights13, as well as toy examples made with `c()` and `tribble()`.

@ -395,7 +400,7 @@ cut(y, breaks = c(0, 5, 10, 15, 20))
 See the documentation for other useful arguments like `right` and `include.lowest`, which control if the intervals are `[a, b)` or `(a, b]` and if the lowest interval should be `[a, b]`.

-### Cumulative and rolling aggregates
+### Cumulative and rolling aggregates {#sec-cumulative-and-rolling-aggregates}

 Base R provides `cumsum()`, `cumprod()`, `cummin()`, `cummax()` for running, or cumulative, sums, products, mins and maxes.
 dplyr provides `cummean()` for cumulative means.
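
Note on the hunk above: a small sketch of the cumulative functions it lists, applied to a made-up vector `x` (an illustrative assumption, not data from the chapter).

```r
library(dplyr)  # only needed for cummean()

# Hypothetical toy vector
x <- c(1, 3, 2, 5)

running_sum  <- cumsum(x)   # 1 4 6 11
running_prod <- cumprod(x)  # 1 3 6 30
running_min  <- cummin(x)   # 1 1 1 1
running_max  <- cummax(x)   # 1 3 3 5
running_mean <- cummean(x)  # 1 2 2 2.75
```

Each result has the same length as `x`, which is what makes these functions usable inside `mutate()`.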
@ -544,16 +549,15 @@ events
 ```

 But how do we go from that logical vector to something that we can `group_by()`?
-`consecutive_id()` comes to the rescue:
+`cumsum()` from @sec-cumulative-and-rolling-aggregates comes to the rescue as each occurring gap, i.e., `gap` is `TRUE`, increments `group` by one (see @sec-numeric-summaries-of-logicals on the numerical interpretation of logicals):

 ```{r}
 events |> mutate(
-  group = consecutive_id(gap)
+  group = cumsum(gap)
 )
 ```
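
Note on the change above: the two functions are not interchangeable on a logical vector. The sketch below compares them on a made-up `gap` vector (hypothetical, not the chapter's `events` data) and assumes dplyr 1.1.0 or later, where `consecutive_id()` is available.

```r
library(dplyr)

# Hypothetical logical vector: TRUE marks the start of a new run of events
gap <- c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)

# cumsum() treats TRUE as 1, so the counter goes up at every TRUE
by_cumsum <- cumsum(gap)
# 0 0 1 1 2 3 3

# consecutive_id() starts a new id whenever the value changes,
# including TRUE -> FALSE, and keeps adjacent TRUEs together
by_consecutive <- consecutive_id(gap)
# 1 1 2 3 4 4 5
```

With `cumsum(gap)`, every `TRUE` opens a new group and the following `FALSE` rows stay in it; with `consecutive_id(gap)`, the row that opens a gap and the rows after it end up in different groups.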

-`consecutive_id()` starts a new group every time one of its arguments changes.
-That makes it useful both here, with logical vectors, and in many other place.
+Another approach for creating grouping variables is `consecutive_id()`, which starts a new group every time one of its arguments changes.
 For example, inspired by [this stackoverflow question](https://stackoverflow.com/questions/27482712), imagine you have a data frame with a bunch of repeated values:

 ```{r}