Fix/data-transform (#1398)

* fix wrong references, inconsistency between sentence and code, and typos
* Update data-transform.qmd
* Update data-transform.qmd
* Update data-transform.qmd
* Update logicals.qmd

---------

Co-authored-by: Mine Cetinkaya-Rundel <cetinkaya.mine@gmail.com>
2023-04-10 12:21:13 +09:00
parent b9f4ad61c3
commit e5a847f7b3
1 changed file with 6 additions and 6 deletions
--- a/data-transform.qmd
+++ b/data-transform.qmd
@@ -58,7 +58,7 @@ glimpse(flights)
 ```

 In both views, the variables names are followed by abbreviations that tell you the type of each variable: `<int>` is short for integer, `<dbl>` is short for double (aka real numbers), `<chr>` for character (aka strings), and `<dttm>` for date-time.
-These are important because the operations you can perform on a column depend so much on its "type", and these types are used to organize the chapters in the next section of the book.
+These are important because the operations you can perform on a column depend so much on its "type".

 ### dplyr basics

@@ -102,7 +102,7 @@ We'll also discuss `distinct()` which finds rows with unique values but unlike `
 `filter()` allows you to keep rows based on the values of the columns[^data-transform-1].
 The first argument is the data frame.
 The second and subsequent arguments are the conditions that must be true to keep the row.
-For example, we could find all flights that arrived more than 120 minutes (two hours) late:
+For example, we could find all flights that departed more than 120 minutes (two hours) late:

 [^data-transform-1]: Later, you'll learn about the `slice_*()` family which allows you to choose rows based on their positions.

@@ -225,7 +225,7 @@ flights |>

 ### Exercises

-1.  In a single pipeline, find all flights that meet all of the following conditions:
+1.  In a single pipeline, find all flights that meet each of the following conditions:

    -   Had an arrival delay of two or more hours
    -   Flew to Houston (`IAH` or `HOU`)
@@ -251,7 +251,7 @@ flights |>

 ## Columns

-There are four important verbs that affect the columns without changing the rows: `mutate()` creates new columns that are derived from the existing columns, `select()` changes which columns are present; `rename()` changes the names of the columns; and `relocate()` changes the positions of the columns.
+There are four important verbs that affect the columns without changing the rows: `mutate()` creates new columns that are derived from the existing columns, `select()` changes which columns are present, `rename()` changes the names of the columns, and `relocate()` changes the positions of the columns.

 ### `mutate()` {#sec-mutate}

@@ -479,7 +479,7 @@ flights |>
  arrange(desc(speed))
 ```

-Even though this pipeline has four steps, it's easy to skim because the verbs come at the start of each line: start with the `flights` data, then filter, then group, then summarize.
+Even though this pipeline has four steps, it's easy to skim because the verbs come at the start of each line: start with the `flights` data, then filter, then mutate, then select, then arrange.

 What would happen if we didn't have the pipe?
 We could nest each function call inside the previous call:
@@ -575,7 +575,7 @@ This means subsequent operations will now work "by month".
 ### `summarize()` {#sec-summarize}

 The most important grouped operation is a summary, which, if being used to calculate a single summary statistic, reduces the data frame to have a single row for each group.
-In dplyr, this is operation is performed by `summarize()`[^data-transform-3], as shown by the following example, which computes the average departure delay by month:
+In dplyr, this operation is performed by `summarize()`[^data-transform-3], as shown by the following example, which computes the average departure delay by month:

 [^data-transform-3]: Or `summarise()`, if you prefer British English.