Add exercise on group_by (#1203)
* Add exercise on group_by
* Don't eval the code chunks
* Edits + indentation
parent 29c8822d3b
commit 26a20c586a
@@ -582,6 +582,88 @@ As you can see, when you summarize an ungrouped data frame, you get a single row

5.  Explain what `count()` does in terms of the dplyr verbs you just learned.
    What does the `sort` argument to `count()` do?

6.  Suppose we have the following tiny data frame:

    ```{r}
    df <- tibble(
      x = 1:5,
      y = c("a", "b", "a", "a", "b"),
      z = c("K", "K", "L", "L", "K")
    )
    ```

    a.  What does the following code do?
        Run it, analyze the result, and describe what `group_by()` does.

        ```{r}
        #| eval: false

        df |>
          group_by(y)
        ```

    b.  What does the following code do?
        Run it, analyze the result, and describe what `arrange()` does.
        Also, comment on how it's different from the `group_by()` in part (a).

        ```{r}
        #| eval: false

        df |>
          arrange(y)
        ```

    c.  What does the following code do?
        Run it, analyze the result, and describe what the pipeline does.

        ```{r}
        #| eval: false

        df |>
          group_by(y) |>
          summarize(mean_x = mean(x))
        ```

    d.  What does the following code do?
        Run it, analyze the result, and describe what the pipeline does.
        Then, comment on what the message says.

        ```{r}
        #| eval: false

        df |>
          group_by(y, z) |>
          summarize(mean_x = mean(x))
        ```

    e.  What does the following code do?
        Run it, analyze the result, and describe what the pipeline does.
        How is the output different from the one in part (d)?

        ```{r}
        #| eval: false

        df |>
          group_by(y, z) |>
          summarize(mean_x = mean(x), .groups = "drop")
        ```

    f.  What do the following pipelines do?
        Run both, analyze the results, and describe what each pipeline does.
        How are the outputs of the two pipelines different?

        ```{r}
        #| eval: false

        df |>
          group_by(y, z) |>
          summarize(mean_x = mean(x))

        df |>
          group_by(y, z) |>
          mutate(mean_x = mean(x))
        ```

## Case study: aggregates and sample size {#sec-sample-size}

Whenever you do any aggregation, it's always a good idea to include a count (`n()`).
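
As a minimal sketch of this advice, assuming dplyr is installed and reusing the tiny `df` from the exercises above, you could compute the count alongside the mean. The names `by_y`, `mean_x`, and `n` are illustrative choices, not part of the original text.

```{r}
#| eval: false

library(dplyr)

# Same tiny data frame as in the exercises
df <- tibble(
  x = 1:5,
  y = c("a", "b", "a", "a", "b"),
  z = c("K", "K", "L", "L", "K")
)

# Report the group size next to each aggregate
by_y <- df |>
  group_by(y) |>
  summarize(
    mean_x = mean(x),
    n = n()
  )

by_y
```

Here group `a` contains three rows and group `b` contains two, so the `n` column makes it clear that each mean rests on very little data.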