Fix/databases probably typos (#1474)
* typos
* probably typos
* a typo
* typos
* a typo
@@ -16,7 +16,7 @@ You want to be able to reach into the database directly to get the data you need
 In this chapter, you'll first learn the basics of the DBI package: how to use it to connect to a database and then retrieve data with a SQL[^databases-1] query.
 **SQL**, short for **s**tructured **q**uery **l**anguage, is the lingua franca of databases, and is an important language for all data scientists to learn.
 That said, we're not going to start with SQL, but instead we'll teach you dbplyr, which can translate your dplyr code to the SQL.
-We'll use that as way to teach you some of the most important features of SQL.
+We'll use that as a way to teach you some of the most important features of SQL.
 You won't become a SQL master by the end of the chapter, but you will be able to identify the most important components and understand what they do.
 
 [^databases-1]: SQL is either pronounced "s"-"q"-"l" or "sequel".
@@ -37,7 +37,7 @@ library(tidyverse)
 ## Database basics
 
 At the simplest level, you can think about a database as a collection of data frames, called **tables** in database terminology.
-Like a data.frame, a database table is a collection of named columns, where every value in the column is the same type.
+Like a data frame, a database table is a collection of named columns, where every value in the column is the same type.
 There are three high level differences between data frames and database tables:
 
 -   Database tables are stored on disk and can be arbitrarily large.
@@ -66,7 +66,7 @@ To connect to the database from R, you'll use a pair of packages:
 -   You'll also use a package tailored for the DBMS you're connecting to.
     This package translates the generic DBI commands into the specifics needed for a given DBMS.
     There's usually one package for each DBMS, e.g.
-    RPostgres for Postgres and RMariaDB for MySQL.
+    RPostgres for PostgreSQL and RMariaDB for MySQL.
 
 If you can't find a specific package for your DBMS, you can usually use the odbc package instead.
 This uses the ODBC protocol supported by many DBMS.
@@ -94,7 +94,7 @@ con <- DBI::dbConnect(
 The precise details of the connection vary a lot from DBMS to DBMS so unfortunately we can't cover all the details here.
 This means you'll need to do a little research on your own.
 Typically you can ask the other data scientists in your team or talk to your DBA (**d**ata**b**ase **a**dministrator).
-The initial setup will often take a little fiddling (and maybe some googling) to get right, but you'll generally only need to do it once.
+The initial setup will often take a little fiddling (and maybe some googling) to get it right, but you'll generally only need to do it once.
 
 ### In this book
 
@@ -110,7 +110,7 @@ con <- DBI::dbConnect(duckdb::duckdb())
 ```
 
 duckdb is a high-performance database that's designed very much for the needs of a data scientist.
-We use it here because it's very to easy to get started with, but it's also capable of handling gigabytes of data with great speed.
+We use it here because it's very easy to get started with, but it's also capable of handling gigabytes of data with great speed.
 If you want to use duckdb for a real data analysis project, you'll also need to supply the `dbdir` argument to make a persistent database and tell duckdb where to save it.
 Assuming you're using a project (@sec-workflow-scripts-projects), it's reasonable to store it in the `duckdb` directory of the current project:
 
@@ -301,7 +301,7 @@ The following sections explore each clause in more detail.
 
 ::: callout-note
 Note that while SQL is a standard, it is extremely complex and no database follows it exactly.
-While the main components that we'll focus on in this book are very similar between DBMSs, there are many minor variations.
+While the main components that we'll focus on in this book are very similar between DBMS's, there are many minor variations.
 Fortunately, dbplyr is designed to handle this problem and generates different translations for different databases.
 It's not perfect, but it's continually improving, and if you hit a problem you can file an issue [on GitHub](https://github.com/tidyverse/dbplyr/issues/) to help us do better.
 :::
@@ -426,7 +426,7 @@ flights |>
   summarize(delay = mean(arr_delay))
 ```
 
-If you want to learn more about how NULLs work, you might enjoy "[*Three valued logic*](https://modern-sql.com/concept/three-valued-logic)" by Markus Winand.
+If you want to learn more about how `NULL`s work, you might enjoy "[*Three valued logic*](https://modern-sql.com/concept/three-valued-logic)" by Markus Winand.
 
 In general, you can work with `NULL`s using the functions you'd use for `NA`s in R:
 
@@ -655,7 +655,7 @@ dbplyr's translations are certainly not perfect, and there are many R functions
 
 In this chapter you learned how to access data from databases.
 We focused on dbplyr, a dplyr "backend" that allows you to write the dplyr code you're familiar with, and have it be automatically translated to SQL.
-We used that translation to teach you a little SQL; it's important to learn some SQL because it's *the* most commonly used language for working with data and knowing some will it easier for you to communicate with other data folks who don't use R.
+We used that translation to teach you a little SQL; it's important to learn some SQL because it's *the* most commonly used language for working with data and knowing some will make it easier for you to communicate with other data folks who don't use R.
 If you've finished this chapter and would like to learn more about SQL.
 We have two recommendations:
 