Minor typos (#1033)

Mine Cetinkaya-Rundel 2022-06-01 00:05:59 -04:00 committed by GitHub
parent b67ac1dbe8
commit dd64a615bd
1 changed file with 11 additions and 9 deletions

@@ -250,7 +250,7 @@ The rest of the chapter will teach you a little about SQL through the lens of db
It's a rather non-traditional introduction to SQL but I hope it will get you quickly up to speed with the basics.
It will hopefully help you understand the parallels between SQL and dplyr but it's not going to give you much practice writing SQL.
For that, I'd recommend [*SQL for Data Scientists*](https://sqlfordatascientists.com) by Renée M. P. Teate.
-It's an introduction to SQL designed specifically for the needs of data scientists, and includes examples of the sort of highly interconnected data you're likely to encounter in real organisations.
+It's an introduction to SQL designed specifically for the needs of data scientists, and includes examples of the sort of highly interconnected data you're likely to encounter in real organizations.
Luckily, if you understand dplyr you're in a great place to quickly pick up SQL because so many of the concepts are the same.
@@ -306,7 +306,7 @@ flights |>
There are two important differences between dplyr verbs and SELECT clauses:
- SQL, unlike R, is **case insensitive** so you can write `select`, `SELECT`, or even `SeLeCt`. In this book we'll stick with the common convention of writing SQL keywords in uppercase to distinguish them from table or variable names.
-- In SQL, order matters. Unlike dplyr, where you can call the verbs in whatever order makes the most sense to you, SQL clauses must come in a specific order: `SELECT`, `FROM`, `WHERE`, `GROUP BY`, `ORDER BY`. Confusingly, this order doesn't match how they are actually evaluated which is `FROM`, `WHERE`, `GROUP BY`, `SELECT`, `ORDER BY`.
+- In SQL, order matters. Unlike dplyr, where you can call the verbs in whatever order makes the most sense to you, SQL clauses must come in a specific order: `SELECT`, `FROM`, `WHERE`, `GROUP BY`, `ORDER BY`. Confusingly, this order doesn't match how they are actually evaluated, which is `FROM`, `WHERE`, `GROUP BY`, `SELECT`, `ORDER BY`.
The following sections will explore each clause in more detail.
@@ -337,8 +337,9 @@ flights |>
show_query()
```
-This example also shows you have SQL does renaming.
-In SQL terminology is called **aliasing** and is done with `AS`; just note that unlike with `mutate()`, the old name is on the left and the new name is on the right.
+This example also shows you how SQL does renaming.
+In SQL terminology renaming is called **aliasing** and is done with `AS`.
+Note that unlike with `mutate()`, the old name is on the left and the new name is on the right.
The translations for `mutate()` are similarly straightforward.
We'll come back to the translation of individual components in @sec-sql-expressions.
@@ -370,7 +371,7 @@ But only a handful of DBMS clients, like duckdb, actually know the complete list
### GROUP BY
-When pared with `group_by()`, `summarise()` is also translated to `SELECT`:
+When paired with `group_by()`, `summarise()` is also translated to `SELECT`:
```{r}
diamonds_db |>
@@ -412,7 +413,7 @@ flights |>
show_query()
```
-SQL doesn't have `NA`s, but instead has `NULL`s.
+SQL doesn't have `NA`s, but instead it has `NULL`s.
They behave very similarly to `NA`s, including their "infectious" properties.
```{r}
@@ -421,7 +422,7 @@ flights |>
show_query()
```
-This SQL illustrates one of the drawbacks of dbplyr: it doesn't always generate the simplest SQL.
+This SQL query illustrates one of the drawbacks of dbplyr: it doesn't always generate the simplest SQL.
In this case, the parentheses are redundant and you could use the special form `IS NOT NULL` yielding:
``` sql
@@ -442,8 +443,9 @@ Note that `desc()` becomes `DESC`; this is another R function whose name was di
### Subqueries
-Sometimes you'll notice that dbplyr generates more than one SELECT it's not possible to express what you want in a single query.
-For example, in `SELECT` can only refer to columns that exist in the `FROM`, not columns that you have just created.
+Sometimes it's not possible to express what you want in a single query.
+For example, in `SELECT` you can only refer to columns that exist in the `FROM`, not columns that you have just created.
So if you modify a column that you just created, dbplyr will need to create a subquery:
```{r}