Suggestions for chapters on logical vectors and numbers (#1212)

2023-01-04 13:31:49 -08:00 · 2023-01-04 13:31:49 -08:00 · 69b2a265a8
parent b5d6735959
commit 69b2a265a8
2 changed files with 10 additions and 6 deletions
--- a/logicals.qmd
+++ b/logicals.qmd
@@ -480,7 +480,7 @@ Instead, you can switch to `dplyr::case_when()`.

 ### `case_when()`

-dplyr's `case_when()` is inspired by SQL's `CASE` statement and provides a flexible way of performing different computations for different computations.
+dplyr's `case_when()` is inspired by SQL's `CASE` statement and provides a flexible way of performing different computations for different conditions.
 It has a special syntax that unfortunately looks like nothing else you'll use in the tidyverse.
 It takes pairs that look like `condition ~ output`.
 `condition` must be a logical vector; when it's `TRUE`, `output` will be used.
--- a/numbers.qmd
+++ b/numbers.qmd
@@ -18,6 +18,11 @@ We'll finish off by covering the summary functions that pair well with `summariz

 ### Prerequisites

+::: callout-important
+This chapter relies on features only found in dplyr 1.1.0, which is still in development.
+If you want to live on the edge, you can get the dev versions with `devtools::install_github("tidyverse/dplyr")`.
+:::
+
 This chapter mostly uses functions from base R, which are available without loading any packages.
 But we still need the tidyverse because we'll use these base R functions inside of tidyverse functions like `mutate()` and `filter()`.
 Like in the last chapter, we'll use real examples from nycflights13, as well as toy examples made with `c()` and `tribble()`.
@@ -395,7 +400,7 @@ cut(y, breaks = c(0, 5, 10, 15, 20))

 See the documentation for other useful arguments like `right` and `include.lowest`, which control if the intervals are `[a, b)` or `(a, b]` and if the lowest interval should be `[a, b]`.

-### Cumulative and rolling aggregates
+### Cumulative and rolling aggregates {#sec-cumulative-and-rolling-aggregates}

 Base R provides `cumsum()`, `cumprod()`, `cummin()`, `cummax()` for running, or cumulative, sums, products, mins and maxes.
 dplyr provides `cummean()` for cumulative means.
@@ -544,16 +549,15 @@ events
 ```

 But how do we go from that logical vector to something that we can `group_by()`?
-`consecutive_id()` comes to the rescue:
+`cumsum()` from @sec-cumulative-and-rolling-aggregates comes to the rescue: each gap, i.e., each row where `gap` is `TRUE`, increments `group` by one (see @sec-numeric-summaries-of-logicals on the numerical interpretation of logicals):

 ```{r}
 events |> mutate(
-  group = consecutive_id(gap)
+  group = cumsum(gap)
 )
 ```

-`consecutive_id()` starts a new group every time one of its arguments changes.
-That makes it useful both here, with logical vectors, and in many other place.
+Another approach for creating grouping variables is `consecutive_id()`, which starts a new group every time one of its arguments changes.
 For example, inspired by [this stackoverflow question](https://stackoverflow.com/questions/27482712), imagine you have a data frame with a bunch of repeated values:

 ```{r}