parent
2afd79c1f6
commit
0f956d64db
14
factors.Rmd
@@ -144,7 +144,7 @@ When working with factors, the two most common operations are changing the order
 It's often useful to change the order of the factor levels in a visualisation. For example, imagine you want to explore the average number of hours spent watching TV per day across religions:
 
 ```{r}
-relig <- gss_cat %>%
+relig_summary <- gss_cat %>%
   group_by(relig) %>%
   summarise(
     age = mean(age, na.rm = TRUE),
@@ -152,7 +152,7 @@ relig <- gss_cat %>%
     n = n()
   )
 
-ggplot(relig, aes(tvhours, relig)) + geom_point()
+ggplot(relig_summary, aes(tvhours, relig)) + geom_point()
 ```
 
 It is difficult to interpret this plot because there's no overall pattern. We can improve it by reordering the levels of `relig` using `fct_reorder()`. `fct_reorder()` takes three arguments:
@@ -163,7 +163,7 @@ It is difficult to interpret this plot because there's no overall pattern. We ca
 `x` for each value of `f`. The default value is `median`.
 
 ```{r}
-ggplot(relig, aes(tvhours, fct_reorder(relig, tvhours))) +
+ggplot(relig_summary, aes(tvhours, fct_reorder(relig, tvhours))) +
   geom_point()
 ```
 
@@ -172,7 +172,7 @@ Reordering religion makes it much easier to see that people in the "Don't know"
 As you start making more complicated transformations, I'd recommend moving them out of `aes()` and into a separate `mutate()` step. For example, you could rewrite the plot above as:
 
 ```{r, eval = FALSE}
-relig %>%
+relig_summary %>%
   mutate(relig = fct_reorder(relig, tvhours)) %>%
   ggplot(aes(tvhours, relig)) +
     geom_point()
@@ -180,7 +180,7 @@ relig %>%
 What if we create a similar plot looking at how average age varies across reported income level?
 
 ```{r}
-rincome <- gss_cat %>%
+rincome_summary <- gss_cat %>%
   group_by(rincome) %>%
   summarise(
     age = mean(age, na.rm = TRUE),
@@ -188,7 +188,7 @@ rincome <- gss_cat %>%
     n = n()
   )
 
-ggplot(rincome, aes(age, fct_reorder(rincome, age))) + geom_point()
+ggplot(rincome_summary, aes(age, fct_reorder(rincome, age))) + geom_point()
 ```
 
 Here, arbitrarily reordering the levels isn't a good idea! That's because `rincome` already has a principled order that we shouldn't mess with. Reserve `fct_reorder()` for factors whose levels are arbitrarily ordered.
@@ -196,7 +196,7 @@ Here, arbitrarily reordering the levels isn't a good idea! That's because `rinco
 However, it does make sense to pull "Not applicable" to the front with the other special levels. You can use `fct_relevel()`. It takes a factor, `f`, and then any number of levels that you want to move to the front of the line.
 
 ```{r}
-ggplot(rincome, aes(age, fct_relevel(rincome, "Not applicable"))) +
+ggplot(rincome_summary, aes(age, fct_relevel(rincome, "Not applicable"))) +
   geom_point()
 ```
 
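Review note: the two forcats helpers discussed in the prose around these hunks, `fct_reorder()` and `fct_relevel()`, can be tried on toy data without loading `gss_cat`. The sketch below is not part of the commit. The vectors `f` and `x` are made-up values for illustration, and it assumes the forcats package is installed.

```r
library(forcats)

# Hypothetical toy data (not from gss_cat): a factor and a numeric vector.
f <- factor(c("a", "b", "c", "a", "b", "c"))
x <- c(5, 1, 3, 7, 2, 4)

# fct_reorder() sorts the levels of f by a summary of x within each level.
# The default summary is the median: a = 6, b = 1.5, c = 3.5.
by_median <- fct_reorder(f, x)
levels(by_median)

# fct_relevel() moves the named level(s) to the front and keeps the
# remaining levels in their existing order.
c_first <- fct_relevel(f, "c")
levels(c_first)
```

With these values the first call should order the levels as `b`, `c`, `a` (ascending median), and the second as `c`, `a`, `b`. This mirrors the distinction drawn in the text: reorder by a summary when the level order is arbitrary, and relevel by name when only a special level needs to move.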