Discuss type consistency

Fixes #1109
Hadley Wickham 2022-11-21 08:01:17 -06:00
parent 223e09a22b
commit c89aa20627
1 changed files with 27 additions and 1 deletions

@ -346,7 +346,7 @@ flights |>
In most cases, however, `any()` and `all()` are a little too crude, and it would be nice to be able to get a little more detail about how many values are `TRUE` or `FALSE`.
That leads us to the numeric summaries.
### Numeric summaries of logical vectors
### Numeric summaries of logical vectors {#sec-numeric-summaries-of-logicals}
When you use a logical vector in a numeric context, `TRUE` becomes 1 and `FALSE` becomes 0.
This makes `sum()` and `mean()` very useful with logical vectors because `sum(x)` will give the number of `TRUE`s and `mean(x)` the proportion of `TRUE`s.
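As a minimal sketch of that coercion, using a made-up vector `x` that is not part of the original text:

```{r}
# Hypothetical example vector; any logical vector behaves the same way
x <- c(TRUE, FALSE, TRUE, TRUE)

# TRUE counts as 1 and FALSE as 0, so the sum is the number of TRUEs
n_true <- sum(x)

# and the mean is the proportion of TRUEs
prop_true <- mean(x)

n_true
prop_true
```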
@ -545,6 +545,32 @@ flights |>
)
```
### Compatible types
Note that both `if_else()` and `case_when()` require **compatible** types in the output.
If they're not compatible, you'll see errors like this:
```{r}
#| error: true
if_else(TRUE, "a", 1)

case_when(
  x < -1 ~ TRUE,
  x > 0  ~ lubridate::now()
)
```
Overall, relatively few types are compatible, because automatically converting one type of vector to another is a common source of errors.
Here are the most important cases that are compatible:
- Numeric and logical vectors are compatible, as we discussed in @sec-numeric-summaries-of-logicals.
- Strings and factors (@sec-factors) are compatible, because you can think of a factor as a string with a restricted set of values.
- Dates and date-times, which we'll discuss in @sec-dates-and-times, are compatible because you can think of a date as a special case of date-time.
- `NA`, which is technically a logical vector, is compatible with everything because every vector has some way of representing a missing value.
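
To make the compatible cases above concrete, here is a small sketch with one `if_else()` call per rule. The vector `cond` and the result names are made up for illustration, and the sketch assumes a version of dplyr (1.1.0 or later) in which `if_else()` finds a common type for its outputs; earlier versions were stricter and may error on some of these calls.

```{r}
library(dplyr)

# Hypothetical condition, used only to pick one value from each branch
cond <- c(TRUE, FALSE)

# Numeric and logical: the result is numeric
num_lgl <- if_else(cond, 1, FALSE)

# String and factor: the result is a character vector
chr_fct <- if_else(cond, "a", factor("b"))

# Date and date-time: the result is a date-time
date_dttm <- if_else(
  cond,
  as.Date("2022-11-21"),
  as.POSIXct("2022-11-21 08:01:17", tz = "UTC")
)

# NA takes on the type of the other branch, here character
chr_na <- if_else(cond, "a", NA)

class(num_lgl)
class(chr_fct)
class(date_dttm)
class(chr_na)
```

In each case the result takes the richer of the two types, so no information is lost in the conversion.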
We don't expect you to memorize these rules, but they should become second nature over time because they are applied consistently throughout the tidyverse.
## Summary
The definition of a logical vector is simple because each value must be either `TRUE`, `FALSE`, or `NA`.