Move more from missing values; fix build failure
@@ -12,7 +12,8 @@ You'll find logical vectors directly in data relatively rarely, but despite that
 
 We'll begin with the most common way of creating logical vectors: numeric comparisons.
 Then we'll talk about using Boolean algebra to combine different logical vectors, and some useful summaries for logical vectors.
-We'll finish off with some other tool for making conditional changes
+We'll finish off with some other tool for making conditional changes.
+Along the way, you'll also learn a little more about working with missing values, `NA`.
 
 ### Prerequisites
 
@@ -269,6 +270,10 @@ Similar reasoning applies with `NA & FALSE`.
 ### Exercises
 
 1.  Find all flights where `arr_delay` is missing but `dep_delay` is not. Find all flights where neither `arr_time` nor `sched_arr_time` are missing, but `arr_delay` is.
+2.  How many flights have a missing `dep_time`? What other variables are missing? What might these rows represent?
+3.  How could you use `arrange()` to sort all missing values to the start? (Hint: use `!is.na()`).
+4.  Come up with another approach that will give you the same output as `not_cancelled |> count(dest)` and `not_cancelled |> count(tailnum, wt = distance)` (without using `count()`).
+5.  Look at the number of cancelled flights per day. Is there a pattern? Is the proportion of cancelled flights related to the average delay?
 
 ## Summaries
 
@@ -416,3 +421,4 @@ df |> filter(cumall(!(balance < 0)))
### 

## 

@@ -18,31 +18,6 @@ Missing topics:
 
 -   `coalesce()` and `na_if()`
 
-## Basics
-
-### Missing values {#missing-values-filter}
-
-If you want to determine if a value is missing, use `is.na()`:
-
-```{r}
-is.na(x)
-```
-
-### Exercises
-
-1.  How many flights have a missing `dep_time`?
-    What other variables are missing?
-    What might these rows represent?
-
-2.  How could you use `arrange()` to sort all missing values to the start?
-    (Hint: use `!is.na()`).
-
-3.  Come up with another approach that will give you the same output as `not_cancelled |> count(dest)` and `not_cancelled |> count(tailnum, wt = distance)` (without using `count()`).
-
-4.  Look at the number of cancelled flights per day.
-    Is there a pattern?
-    Is the proportion of cancelled flights related to the average delay?
-
 ## Explicit vs implicit missing values {#missing-values-tidy}
 
 Changing the representation of a dataset brings up an important subtlety of missing values.