Update rectangling.qmd (#1069)
It is a very nicely written chapter. Here are a few corrections for typos and errors.
@@ -126,8 +126,8 @@ knitr::include_graphics("screenshots/View-3.png", dpi = 220)
 ### List-columns
 
 Lists can also live inside a tibble, where we call them list-columns.
-List-columns are useful because they allow you to shoehorn in objects that wouldn't wouldn't usually belong in a tibble.
-In particular, list-columns are are used a lot in the [tidymodels](https://www.tidymodels.org) ecosystem, because they allows you to store things like models or resamples in a data frame.
+List-columns are useful because they allow you to shoehorn in objects that wouldn't usually belong in a tibble.
+In particular, list-columns are are used a lot in the [tidymodels](https://www.tidymodels.org) ecosystem, because they allow you to store things like models or resamples in a data frame.
 
 Here's a simple example of a list-column:
 
@@ -187,7 +187,7 @@ It's easier to use list-columns with tibbles because `tibble()` treats lists lik
 
 ## Unnesting
 
-Now that you've learned the basics of lists and list-columns, lets explore how you can turn them back into regular rows and columns.
+Now that you've learned the basics of lists and list-columns, let's explore how you can turn them back into regular rows and columns.
 We'll start with very simple sample data so you can get the basic idea, and then switch to more realistic examples in the next section.
 
 List-columns tend to come in two basic forms: named and unnamed.
@@ -195,7 +195,7 @@ When the children are **named**, they tend to have the same names in every row.
 When the children are **unnamed**, the number of elements tends to vary from row-to-row.
 The following code creates an example of each.
 In `df1`, every element of list-column `y` has two elements named `a` and `b`.
-If `df2`, the elements of list-column `y` are unnamed and vary in length.
+In `df2`, the elements of list-column `y` are unnamed and vary in length.
 
 ```{r}
 df1 <- tribble(
@@ -316,7 +316,7 @@ You might wonder if this breaks the commandment that every element of a column m
 What happens if you find this problem in a dataset you're trying to rectangle?
 There are two basic options.
 You could use the `transform` argument to coerce all inputs to a common type.
-It's not particularly useful here because there's only really one class that these five class can be converted to: character.
+It's not particularly useful here because there's only really one class that these five class can be converted to character.
 
 ```{r}
 df4 |> 
@@ -371,7 +371,7 @@ These are good to know about when you're other people's code and for tackling ra
 ## Case studies
 
 So far you've learned about the simplest case of list-columns, where rectangling only requires a single call to `unnest_longer()` or `unnest_wider()`.
-The main difference between real data and these simple examples is that real data typically containsmultiple levels of nesting that requires multiple calls to `unnest_longer()` and `unnest_wider()`.
+The main difference between real data and these simple examples is that real data typically contains multiple levels of nesting that require multiple calls to `unnest_longer()` and `unnest_wider()`.
 This section will work through four real rectangling challenges using datasets from the repurrrsive package that are inspired by datasets that we've encountered in the wild.
 
 ### Very wide data
@@ -426,7 +426,7 @@ repos |>
 
 You can use this to work back to understand how `gh_repos` was strucured: each child was a GitHub user containing a list of up to 30 GitHub repositories that they created.
 
-`owner` is another list-column, and since it a contains a named list, we can use `unnest_wider()` to get at the values:
+`owner` is another list-column, and since it contains a named list, we can use `unnest_wider()` to get at the values:
 
 ```{r}
 #| error: true
@@ -624,7 +624,7 @@ locations |>
   unnest_wider(location)
 ```
 
-Extracting the bounds requires a few more steps
+Extracting the bounds requires a few more steps:
 
 ```{r}
 locations |> 
@@ -649,7 +649,7 @@ locations |>
 
 Note how we unnest two columns simultaneously by supplying a vector of variable names to `unnest_wider()`.
 
-This somewhere that `hoist()`, mentioned briefly above, can be useful.
+This is somewhere that `hoist()`, mentioned briefly above, can be useful.
 Once you've discovered the path to get to the components you're interested in, you can extract them directly using `hoist()`:
 
 ```{r}
@@ -711,17 +711,17 @@ Four of them are scalars:
 
 -   The simplest type is a null, which is written `null`, which plays the same role as both `NULL` and `NA` in R. It represents the absence of data.
 -   A **string** is much like a string in R, but must use double quotes, not single quotes.
--   A **number** is similar to R's numbers: they can be use integer (e.g. 123), decimal (e.g. 123.45), or scientific (e.g. 1.23e3) notation. JSON doesn't support Inf, -Inf, or NaN.
+-   A **number** is similar to R's numbers: they can be integer (e.g. 123), decimal (e.g. 123.45), or scientific (e.g. 1.23e3) notation. JSON doesn't support Inf, -Inf, or NaN.
 -   A **boolean** is similar to R's `TRUE` and `FALSE`, but use lower case `true` and `false`.
 
 JSON's strings, numbers, and booleans are pretty similar to R's character, numeric, and logical vectors.
 The main difference is that JSON's scalars can only represent a single value.
-To represent multiple values you need to use one of the two remaining two types, arrays and objects.
+To represent multiple values you need to use one of the two remaining types, arrays and objects.
 
 Both arrays and objects are similar to lists in R; the difference is whether or not they're named.
 An **array** is like an unnamed list, and is written with `[]`.
 For example `[1, 2, 3]` is an array containing 3 numbers, and `[null, 1, "string", false]` is an array that contains a null, a number, a string, and a boolean.
-An **object** is like a named list, and they're written with `{}`.
+An **object** is like a named list, and it's written with `{}`.
 For example, `{"x": 1, "y": 2}` is an object that maps `x` to 1 and `y` to 2.
 
 ### jsonlite
@@ -729,7 +729,7 @@ For example, `{"x": 1, "y": 2}` is an object that maps `x` to 1 and `y` to 2.
 To convert JSON into R data structures, we recommend that you use the jsonlite package, by Jeroen Oooms.
 We'll use only two jsonlite functions: `read_json()` and `parse_json()`.
 In real life, you'll use `read_json()` to read a JSON file from disk.
-For example, we the repurrsive package also provides the source for `gh_user` as a JSON file:
+For example, the repurrsive package also provides the source for `gh_user` as a JSON file:
 
 ```{r}
 # A path to a json file inside the package: