Complete draft of lists chapter

hadley
2015-12-08 10:11:53 -06:00
parent 27b148d9ea
commit afeae8396b
7 changed files with 119 additions and 92 deletions

@@ -14,6 +14,16 @@ options(digits = 3)
* Bootstrapping to understand uncertainty in parameters.
* Cross-validation to understand predictive quality.
## Purrr + dplyr
That is, how do dplyr and purrr intersect?
* Why use a data frame?
* List columns in a data frame
* Mutate & filter.
* Creating list columns with `group_by()` and `do()`.
## Multiple models
A natural application of `map2()` is handling test-training pairs when doing model evaluation. This is an important modelling technique: you should never evaluate a model on the same data it was fit to, because the results will make you overconfident in the model. Instead, divide the data up, using one piece to fit the model and the other piece to evaluate it. A popular family of techniques for this is cross-validation. In the version described here, you randomly hold out x% of the data and fit the model to the rest, then repeat this a few times to average out the random variation in the splits. (Strictly speaking, k-fold cross-validation splits the data into k folds and holds out each fold in turn; repeated random hold-out is a close relative.)
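
As a rough illustration of the test-training idea, the sketch below uses repeated random hold-out on the built-in `mtcars` data. The number of splits, the 20% hold-out fraction, the `mpg ~ wt` model, and the `rmse()` helper are all assumptions chosen for the example, not something the chapter prescribes.

```r
library(purrr)

set.seed(1014)  # arbitrary seed so the random splits are reproducible

n_splits  <- 5    # assumed: number of random splits
test_frac <- 0.2  # assumed: fraction of rows held out in each split

# For each split, draw the row indices that form the test set.
test_rows <- map(seq_len(n_splits), function(i) {
  sample(nrow(mtcars), size = floor(test_frac * nrow(mtcars)))
})

# Build parallel lists of training and test data frames.
train <- map(test_rows, function(idx) mtcars[-idx, , drop = FALSE])
test  <- map(test_rows, function(idx) mtcars[idx, , drop = FALSE])

# Fit one linear model per training set.
models <- map(train, function(df) lm(mpg ~ wt, data = df))

# Hypothetical helper: root-mean-squared error of a model on a data frame.
rmse <- function(model, data) {
  sqrt(mean((data$mpg - predict(model, newdata = data))^2))
}

# map2_dbl() walks over the models and their matching test sets in parallel,
# so each model is scored only on data it was not fit to.
test_rmse <- map2_dbl(models, test, rmse)
test_rmse
```

Because `models` and `test` are built from the same list of indices, the i-th model always lines up with the i-th test set, which is what makes `map2()` a good fit here. The spread of `test_rmse` across splits gives a sense of how much the estimate depends on the particular random split.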