Typos in transform & EDA (#209)
* little different to -> a little differently from
* reorder to match order in following code/table
* capitalization
* capitalization + punctuation
* replace / with or since it's hard to see between tt formatted code
* missing pronoun
* they're been -> they've been
* lets need apostrophe
* Instead of display -> Instead of displaying
* missing comma
* add mapping in front of aes in a bunch of locations
* adding mapping before aes for sections before the last section where it explicitly says from here on out we'll omit them to make calls simpler
committed by Hadley Wickham
parent 8f087e8ce3
commit fe73722b0a
@@ -24,7 +24,7 @@ To explore the basic data manipulation verbs of dplyr, we'll use `nycflights13::
 flights
 ```
 
-You might notice that this data frame prints little differently to other data frames you might have used in the past: it only shows the first few rows and all the columns that fit on one screen. (To see the whole dataset, you can run `View(flights)` which will open the dataset in the RStudio viewer). It prints differently because it's a __tibble__. Tibbles are data frames, but slightly tweaked to work better in the tidyverse. For now, you don't need to worry about the differences; we'll come back to tibbles in more detail in [wrangle](#wrangle-intro).
+You might notice that this data frame prints a little differently from other data frames you might have used in the past: it only shows the first few rows and all the columns that fit on one screen. (To see the whole dataset, you can run `View(flights)` which will open the dataset in the RStudio viewer). It prints differently because it's a __tibble__. Tibbles are data frames, but slightly tweaked to work better in the tidyverse. For now, you don't need to worry about the differences; we'll come back to tibbles in more detail in [wrangle](#wrangle-intro).
 
 You might also have noticed the row of three letter abbreviations under the column names. These describe the type of each variable:
 
@@ -426,7 +426,7 @@ There are many functions for creating new variables that you can use with `mutat
 (e.g. 1st, 2nd, 2nd, 4th). The default gives smallest values the small
 ranks; use `desc(x)` to give the largest values the smallest ranks.
 If `min_rank()` doesn't do what you need, look at the variants
-`row_number()`, `dense_rank()`, `cume_dist()`, `percent_rank()`,
+`row_number()`, `dense_rank()`, `percent_rank()`, `cume_dist()`,
 `ntile()`.
 
 ```{r}
@@ -481,7 +481,7 @@ The last key verb is `summarise()`. It collapses a data frame to a single row:
 summarise(flights, delay = mean(dep_delay, na.rm = TRUE))
 ```
 
-(we'll come back to what that `na.rm = TRUE` means very shortly.)
+(We'll come back to what that `na.rm = TRUE` means very shortly.)
 
 `summarise()` is not terribly useful unless we pair it with `group_by()`. This changes the unit of analysis from the complete dataset to individual groups. Then, when you use the dplyr verbs on a grouped data frame they'll be automatically applied "by group". For example, if we applied exactly the same code to a data frame grouped by date, we get the average delay per date:
 