Fix 4 typos in joins (#1431)
parent cbcf1e0d8b
commit c31137b0c6
joins.qmd (10 lines changed)
@@ -269,7 +269,7 @@ flights2 |>
 ```

 We get a lot of missing matches because our join is trying to use `tailnum` and `year` as a compound key.
-Both `flights` and `planes` have a `year` column but they mean different things: `flights$year` is year the flight occurred and `planes$year` is the year the plane was built.
+Both `flights` and `planes` have a `year` column but they mean different things: `flights$year` is the year the flight occurred and `planes$year` is the year the plane was built.
 We only want to join on `tailnum` so we need to provide an explicit specification with `join_by()`:

 ```{r}
@@ -627,7 +627,7 @@ df1 |>

 If you are doing this deliberately, you can set `relationship = "many-to-many"`, as the warning suggests.

-### Filtering joins {#sec-non-equi-joins}
+### Filtering joins

 The number of matches also determines the behavior of the filtering joins.
 The semi-join keeps rows in `x` that have one or more matches in `y`, as in @fig-join-semi.
@@ -664,7 +664,7 @@ knitr::include_graphics("diagrams/join/semi.png", dpi = 270)
 knitr::include_graphics("diagrams/join/anti.png", dpi = 270)
 ```

-## Non-equi joins
+## Non-equi joins {#sec-non-equi-joins}

 So far you've only seen equi-joins, joins where the rows match if the `x` key equals the `y` key.
 Now we're going to relax that restriction and discuss other ways of determining if a pair of rows match.
@@ -841,7 +841,7 @@ Overlap joins provide three helpers that use inequality joins to make it easier

 Let's continue the birthday example to see how you might use them.
 There's one problem with the strategy we used above: there's no party preceding the birthdays Jan 1-9.
-So it might be better to to be explicit about the date ranges that each party spans, and make a special case for those early birthdays:
+So it might be better to be explicit about the date ranges that each party spans, and make a special case for those early birthdays:

 ```{r}
 parties <- tibble(
@@ -854,7 +854,7 @@ parties
 ```

 Hadley is hopelessly bad at data entry so he also wanted to check that the party periods don't overlap.
-One way to do this is by using a self-join to check to if any start-end interval overlap with another:
+One way to do this is by using a self-join to check if any start-end interval overlap with another:

 ```{r}
 parties |>
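
Reviewer note: the first hunk's prose describes a compound-key pitfall that is easy to check by hand. Below is a minimal sketch, assuming dplyr 1.1.0 or later (for `join_by()`). The tibbles `flights_small` and `planes_small` and their values are hypothetical stand-ins invented for illustration, not the chapter's data.

```r
library(dplyr)

# Hypothetical miniature stand-ins for the flights and planes tables.
flights_small <- tibble(
  year    = c(2013, 2013, 2013),              # year the flight occurred
  tailnum = c("N14228", "N24211", "N619AA")
)
planes_small <- tibble(
  tailnum = c("N14228", "N24211"),
  year    = c(1999, 1998),                    # year the plane was built
  type    = c("Fixed wing multi engine", "Fixed wing multi engine")
)

# Natural join: both shared columns (year, tailnum) form a compound key,
# so no flight row finds a matching plane and `type` is NA throughout.
natural <- flights_small |> left_join(planes_small)

# Explicit key: join on tailnum only. The two clashing year columns are
# kept and disambiguated with the default suffixes .x and .y.
explicit <- flights_small |> left_join(planes_small, join_by(tailnum))
```

In this sketch `natural$type` is all `NA`, while `explicit` keeps both `year.x` (flight year) and `year.y` (build year) and leaves `type` missing only for the tail number that has no row in `planes_small`.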