Update regexps (#1237)
@@ -545,7 +545,7 @@ dyears(1)
Durations always record the time span in seconds.
Larger units are created by converting minutes, hours, days, weeks, and years to seconds: 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, and 7 days in a week.
Larger time units are more problematic.
-A year is uses the "average" number of days in a year, i.e. 365.25.
+A year uses the "average" number of days in a year, i.e. 365.25.
There's no way to convert a month to a duration, because there's just too much variation.

You can add and multiply durations:
@@ -565,15 +565,15 @@ last_year <- today() - dyears(1)
However, because durations represent an exact number of seconds, sometimes you might get an unexpected result:

```{r}
-one_pm <- ymd_hms("2026-03-12 13:00:00", tz = "America/New_York")
+one_am <- ymd_hms("2026-03-08 01:00:00", tz = "America/New_York")

-one_pm
-one_pm + ddays(1)
+one_am
+one_am + ddays(1)
```

-Why is one day after 1pm March 12, 2pm March 13?
+Why is one day after 1am March 8, 2am March 9?
If you look carefully at the date you might also notice that the time zones have changed.
-March 12 only has 23 hours because it's when DST starts, so if we add a full days worth of seconds we end up with a different time.
+March 8 only has 23 hours because it's when DST starts, so if we add a full day's worth of seconds we end up with a different time.

### Periods

@@ -582,8 +582,8 @@ Periods are time spans but don't have a fixed length in seconds, instead they wo
That allows them to work in a more intuitive way:

```{r}
-one_pm
-one_pm + days(1)
+one_am
+one_am + days(1)
```

Like durations, periods can be created with a number of friendly constructor functions.
@@ -610,8 +610,8 @@ ymd("2024-01-01") + dyears(1)
ymd("2024-01-01") + years(1)

# Daylight Savings Time
-one_pm + ddays(1)
-one_pm + days(1)
+one_am + ddays(1)
+one_am + days(1)
```

Let's use periods to fix an oddity related to our flight dates.

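The duration/period contrast in this hunk can be cross-checked outside R. As a rough sketch using Python's stdlib `zoneinfo` (an illustration only, not lubridate itself): duration-style arithmetic adds absolute seconds via UTC, while period-style arithmetic keeps the wall clock.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")
one_am = datetime(2026, 3, 8, 1, 0, tzinfo=ny)  # DST starts later this morning

# Duration-style (like ddays(1)): add exactly 24 hours of absolute time via UTC.
duration_result = (one_am.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(ny)

# Period-style (like days(1)): aware datetime + timedelta does wall-clock
# arithmetic in Python, so the clock time is preserved across the DST gap.
period_result = one_am + timedelta(days=1)

print(duration_result)  # lands at 02:00 on March 9, like one_am + ddays(1)
print(period_result)    # lands at 01:00 on March 9, like one_am + days(1)
```

The two results differ by exactly one absolute hour, which is the hour that March 8 is missing.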
@@ -179,7 +179,7 @@ This brings us to another important way of revealing implicitly missing observat
You'll learn more about joins in @sec-joins, but we wanted to quickly mention them to you here since you can often only know that values are missing from one dataset when you compare it to another.

`dplyr::anti_join(x, y)` is a particularly useful tool here because it selects only the rows in `x` that don't have a match in `y`.
-For example, we can use two `anti_join()`s reveal to reveal that we're missing information for four airports and 722 planes mentioned in `flights`:
+For example, we can use two `anti_join()`s to reveal that we're missing information for four airports and 722 planes mentioned in `flights`:

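For readers who want the anti-join logic itself, here is a minimal language-neutral sketch in Python, with tiny made-up stand-in tables rather than the real `nycflights13` data: keep exactly the rows of `x` whose key has no match in `y`.

```python
# Hypothetical miniature stand-ins for flights and planes.
flights = [{"tailnum": "N101"}, {"tailnum": "N102"}, {"tailnum": "N999"}]
planes = [{"tailnum": "N101"}, {"tailnum": "N102"}]

def anti_join(x, y, key):
    """Return the rows of x whose key value never appears in y."""
    y_keys = {row[key] for row in y}
    return [row for row in x if row[key] not in y_keys]

missing_planes = anti_join(flights, planes, "tailnum")
print(missing_planes)  # only the tail number absent from planes survives
```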
```{r}
library(nycflights13)
```
regexps.qmd
@@ -252,8 +252,8 @@ These functions are naturally paired with `mutate()` when doing data cleaning, a
### Extract variables {#sec-extract-variables}

The last function we'll discuss uses regular expressions to extract data out of one column into one or more new columns: `separate_wider_regex()`.
-It's a peer of the `separate_wider_location()` and `separate_wider_delim()` functions that you learned about in @sec-string-columns.
-These functions live in tidyr because the operates on (columns of) data frames, rather than individual vectors.
+It's a peer of the `separate_wider_position()` and `separate_wider_delim()` functions that you learned about in @sec-string-columns.
+These functions live in tidyr because they operate on (columns of) data frames, rather than individual vectors.

Let's create a simple dataset to show how it works.
Here we have some data derived from `babynames` where we have the name, gender, and age of a bunch of people in a rather weird format[^regexps-5]:
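The footnoted dataset isn't visible in this hunk, so as a stand-in, here is a sketch of the same extract-by-regex idea using Python named groups on an invented `name-gender_age` format (the format and field names are assumptions, not the book's):

```python
import re

strings = ["Amelia-F_43", "Noah-M_7"]  # invented format for illustration
pattern = re.compile(r"(?P<name>[A-Za-z]+)-(?P<gender>[FM])_(?P<age>[0-9]+)")

# Each named group becomes a "column", mirroring how separate_wider_regex()
# turns named sub-patterns into new data frame columns.
rows = [pattern.fullmatch(s).groupdict() for s in strings]
print(rows)
```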
@@ -377,9 +377,9 @@ str_view(fruit, "^a")
str_view(fruit, "a$")
```

-It's tempting to think that `$` should matches the start of a string, because that's how we write dollar amounts, but it's not what regular expressions want.
+It's tempting to think that `$` should match the start of a string, because that's how we write dollar amounts, but it's not what regular expressions want.

-To force a regular expression to only the full string, anchor it with both `^` and `$`:
+To force a regular expression to match only the full string, anchor it with both `^` and `$`:

```{r}
str_view(fruit, "apple")
@@ -387,7 +387,7 @@ str_view(fruit, "^apple$")
```

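The anchor semantics are the same in most regex engines; a quick cross-language sketch with Python's `re` (illustrative only, not the book's stringr code):

```python
import re

fruit = ["apple", "pineapple", "applesauce"]

# Unanchored: "apple" matches anywhere inside the string.
anywhere = [f for f in fruit if re.search(r"apple", f)]

# Anchored with ^ and $: only a string that is exactly "apple" matches.
exact = [f for f in fruit if re.search(r"^apple$", f)]

print(anywhere)  # all three contain "apple"
print(exact)     # only "apple" itself
```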
You can also match the boundary between words (i.e. the start or end of a word) with `\b`.
-This can be particularly when using RStudio's find and replace tool.
+This can be particularly useful when using RStudio's find and replace tool.
For example, to find all uses of `sum()`, you can search for `\bsum\b` to avoid matching `summarize`, `summary`, `rowsum` and so on:

```{r}
@@ -496,7 +496,7 @@ But unlike algebra you're unlikely to remember the precedence rules for regexes,

### Grouping and capturing

-As well overriding operator precedence, parentheses have another important effect: they create **capturing groups** that allow you to use sub-components of the match.
+As well as overriding operator precedence, parentheses have another important effect: they create **capturing groups** that allow you to use sub-components of the match.

The first way to use a capturing group is to refer back to it within a match with a **back reference**: `\1` refers to the match contained in the first parenthesis, `\2` in the second parenthesis, and so on.
For example, the following pattern finds all fruits that have a repeated pair of letters:
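The pattern itself isn't shown in this hunk; the usual form is `(..)\1`, sketched here with Python's `re` and a few fruit names (the short list is an assumption, not stringr's full `fruit` vector):

```python
import re

fruit = ["banana", "coconut", "papaya", "apple", "kiwi"]

# (..) captures any two characters; \1 demands that exact pair again,
# immediately after, so "anan", "coco", and "papa" all match.
repeated_pair = [f for f in fruit if re.search(r"(..)\1", f)]
print(repeated_pair)
```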
@@ -594,7 +594,7 @@ This allows you to control the so called regex flags and match various types of fix

### Regex flags {#sec-flags}

-There are a number of settings that can use to control the details of the regexp.
+There are a number of settings that can be used to control the details of the regexp.
These settings are often called **flags** in other programming languages.
In stringr, you can use these by wrapping the pattern in a call to `regex()`.
The most useful flag is probably `ignore_case = TRUE` because it allows characters to match either their uppercase or lowercase forms:
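As a cross-language sketch, Python's `re.IGNORECASE` flag plays the role that `regex(pattern, ignore_case = TRUE)` does in stringr (the word list here is invented):

```python
import re

words = ["banana", "Banana", "BANANA"]

# Without the flag, only the exact-case spelling matches.
case_sensitive = [w for w in words if re.search(r"banana", w)]

# With re.IGNORECASE, each character matches either case.
case_insensitive = [w for w in words if re.search(r"banana", w, re.IGNORECASE)]
```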
@@ -680,7 +680,7 @@ str_view("i İ ı I", coll("İ", ignore_case = TRUE, locale = "tr"))
To put these ideas into practice we'll solve a few semi-authentic problems next.
We'll discuss three general techniques:

-1.  checking you work by creating simple positive and negative controls
+1.  checking your work by creating simple positive and negative controls
2.  combining regular expressions with Boolean algebra
3.  creating complex patterns using string manipulation

@@ -830,7 +830,7 @@ str_view(sentences, pattern)
```

In this example, `cols` only contains numbers and letters so you don't need to worry about metacharacters.
-But in general, whenever you create create patterns from existing strings it's wise to run them through `str_escape()` to ensure they match literally.
+But in general, whenever you create patterns from existing strings it's wise to run them through `str_escape()` to ensure they match literally.

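Python's `re.escape()` is the analogue of `str_escape()`; a minimal sketch (the column names are made up) of why escaping matters when patterns are built from data:

```python
import re

cols = ["total (kg)", "rate+tax"]  # invented names containing metacharacters

# Joining the escaped names with "|" builds a pattern that matches each name
# literally; without re.escape(), "(" and "+" would act as metacharacters.
pattern = "|".join(re.escape(c) for c in cols)

candidates = ["total (kg)", "totalXkg", "rate+tax"]
matched = [s for s in candidates if re.fullmatch(pattern, s)]
```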
### Exercises

@@ -862,10 +862,10 @@ There are three other particularly useful places where you might want to use a r
-   `matches(pattern)` will select all variables whose name matches the supplied pattern.
    It's a "tidyselect" function that you can use anywhere in any tidyverse function that selects variables (e.g. `select()`, `rename_with()` and `across()`).

--   `pivot_longer()'s` `names_pattern` argument takes a vector of regular expressions, just like `separate_with_regex()`.
+-   `pivot_longer()`'s `names_pattern` argument takes a vector of regular expressions, just like `separate_wider_regex()`.
    It's useful when extracting data out of variable names with a complex structure.

--   The `delim` argument in `separate_delim_longer()` and `separate_delim_wider()` usually matches a fixed string, but you can use `regex()` to make it match a pattern.
+-   The `delim` argument in `separate_longer_delim()` and `separate_wider_delim()` usually matches a fixed string, but you can use `regex()` to make it match a pattern.
    This is useful, for example, if you want to match a comma that is optionally followed by a space, i.e. `regex(", ?")`.

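The same pattern-as-delimiter idea, sketched with Python's `re.split()` (illustrative only):

```python
import re

# ", ?" splits on a comma optionally followed by a single space, so
# inconsistent spacing around commas is handled uniformly.
parts = re.split(r", ?", "apples, bananas,pears")
print(parts)
```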
### Base R