Use na.warn
parent 8edc9a8768
commit 914192f1e9
@@ -43,12 +43,12 @@ The goal of a model is not to uncover truth, but to discover a simple approximat
 
 ### Prerequisites
 
 We need a couple of packages specifically designed for modelling, and all the packages you've used before for EDA.
 
 ```{r setup, message = FALSE}
 # Modelling functions
 library(modelr)
-library(broom)
+options(na.action = na.warn)
 
 # EDA tools
 library(ggplot2)
@@ -661,7 +661,7 @@ sim6 %>%
 
 ## Missing values
 
-Missing values obviously can not convey any information about the relationship between the variables, so modelling functions will silently drop any rows that contain missing values:
+Missing values obviously can not convey any information about the relationship between the variables, so modelling functions will drop any rows that contain missing values. R's default behaviour is to silently drop them, but `options(na.action = na.warn)` (run in the prerequisites), makes sure you get a warning.
 
 ```{r}
 df <- tibble::frame_data(
@@ -676,11 +676,16 @@ df <- tibble::frame_data(
 mod <- lm(y ~ x, data = df)
 ```
 
-Unfortunately this is one of the rare cases in R where missing values will go silently missing without any warning. You can spot their absence by comparing the number of rows in the data frame with the number of observations used by the model:
+To suppress the warning, set `na.action = na.exclude`:
 
+```{r}
+mod <- lm(y ~ x, data = df, na.action = na.exclude)
+```
+
+You can always see exactly how many observations were used with `nobs()`:
+
 ```{r}
 nobs(mod)
-nrow(df)
 ```
 
 ## Other model families
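For anyone checking the behaviour this commit describes, here is a minimal sketch in R. The data frame below is made up for illustration (the `df` in the diff is elided, so its values are not known), and the names `mod_warn`, `mod_quiet`, and `old` are hypothetical. It assumes `modelr` is installed, since `na.warn` comes from that package.

```r
library(modelr)  # provides na.warn

# Illustrative data only: rows 2 and 3 each contain a missing value
df <- data.frame(
  x = c(1, 2, NA, 4, 5),
  y = c(2.1, NA, 6.2, 8.1, 9.9)
)

# Set the handler globally, keeping the previous setting so it can be restored
old <- options(na.action = na.warn)

# With na.warn active, fitting should warn about the rows being dropped
mod_warn <- lm(y ~ x, data = df)

# Passing na.action explicitly overrides the option, so no warning here
mod_quiet <- lm(y ~ x, data = df, na.action = na.exclude)

# Compare observations used by the model with rows in the data
nobs(mod_warn)
nrow(df)

# Restore the previous na.action setting
options(old)
```

Comparing `nobs()` with `nrow()` still works as a manual check for dropped rows, even though this commit removes the `nrow(df)` line from the chapter.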