commit
cac428b199
8
tidy.Rmd
@@ -87,7 +87,7 @@ knitr::include_graphics("images/tidy-2.png")
 
 *A data frame is a list of vectors that R displays as a table. When your data is tidy, the values of each variable fall in their own column vector.*
 
-As a result, you can extract the all of the values of a variable in a tidy data set by extracting the column vector that contains the variable. You can do this easily with R's list syntax, i.e.
+As a result, you can extract all the values of a variable in a tidy data set by extracting the column vector that contains the variable. You can do this easily with R's list syntax, i.e.
 
 ```{r}
 table1$cases
@@ -247,7 +247,7 @@ Every cell in a table of data contains one half of a key value pair, as does eve
 table2
 ```
 
-In `table2`, the `key` column contains only keys (and not just because the column is labelled `key`). Conveniently, the `value` column contains the values associated with those keys.
+In `table2`, the `key` column contains only keys (and not just because the column is labeled `key`). Conveniently, the `value` column contains the values associated with those keys.
 
 You can use the `spread()` function to tidy this layout.
 
@@ -269,7 +269,7 @@ knitr::include_graphics("images/tidy-8.png")
 
 *`spread()` distributes a pair of key:value columns into a field of cells. The unique keys in the key column become the column names of the field of cells.*
 
-You can see that `spread()` maintains each of the relationships expressed in the original data set. The output contains the four original variables, *country*, *year*, *population*, and *cases*, and the values of these variables are grouped according to the orginal observations. As a bonus, now the layout of these relationships is tidy.
+You can see that `spread()` maintains each of the relationships expressed in the original data set. The output contains the four original variables, *country*, *year*, *population*, and *cases*, and the values of these variables are grouped according to the original observations. As a bonus, now the layout of these relationships is tidy.
 
 `spread()` takes three optional arguments in addition to `data`, `key`, and `value`:
 
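To make the `spread()` behavior described in this hunk concrete, here is a minimal sketch. It assumes the tidyr package is installed; the data frame `long` and its values are hypothetical placeholders, not the `table2` data from the file.

```r
library(tidyr)

# Hypothetical key/value data; names and numbers are illustrative only.
long <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(1999, 1999, 1999, 1999),
  key     = c("cases", "population", "cases", "population"),
  value   = c(10, 1000, 20, 2000),
  stringsAsFactors = FALSE
)

# Each unique entry in `key` becomes a column name;
# the matching entries in `value` fill the cells of that column.
wide <- spread(long, key, value)
wide
```

The result has one row per country and year, with `cases` and `population` as separate columns, so each variable sits in its own column vector.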
@@ -367,7 +367,7 @@ You can also pass an integer or vector of integers to `sep`. `separate()` will i
 separate(table3, year, into = c("century", "year"), sep = 2)
 ```
 
-You can futher customize `separate()` with the `remove`, `convert`, and `extra` arguments:
+You can further customize `separate()` with the `remove`, `convert`, and `extra` arguments:
 
 - **`remove`** - Set `remove = FALSE` to retain the column of values that were separated in the final data frame.
 - **`convert`** - By default, `separate()` will return new columns as character columns. Set `convert = TRUE` to convert new columns to double (numeric), integer, logical, complex, and factor columns with `type.convert()`.
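The `remove` and `convert` arguments touched on in this hunk can be sketched as follows. This assumes tidyr is installed; the data frame `df` and the column name `yr` are hypothetical and chosen only so the new columns do not collide with the retained `year` column.

```r
library(tidyr)

# Hypothetical data with a four-digit year stored as text.
df <- data.frame(
  country = c("A", "B"),
  year    = c("1999", "2000"),
  stringsAsFactors = FALSE
)

# Defaults: `year` is dropped and the two new columns are character.
default_split <- separate(df, year, into = c("century", "yr"), sep = 2)

# remove = FALSE keeps the original `year` column;
# convert = TRUE lets the new columns be converted from character.
kept <- separate(df, year, into = c("century", "yr"), sep = 2,
                 remove = FALSE, convert = TRUE)
str(kept)
```

Comparing `str(default_split)` with `str(kept)` shows both effects at once: the extra `year` column and the changed column types.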