Remove modelling
- Move files to extras/ for now
- Adjust references to modelling
- Or add TO DO items to adjust later
parent f9109aadfe
commit 55803fc8a3
5
EDA.Rmd
@@ -623,6 +623,8 @@ It's possible to use a model to remove the very strong relationship between pric
 The following code fits a model that predicts `price` from `carat` and then computes the residuals (the difference between the predicted value and the actual value).
 The residuals give us a view of the price of the diamond, once the effect of carat has been removed.
 
+<!--# TO DO: Replace modelr based workflow with tidymodels, as a sneak preview. -->
+
 ```{r, dev = "png"}
 library(modelr)
@@ -643,8 +645,7 @@ ggplot(data = diamonds2) +
 geom_boxplot(mapping = aes(x = cut, y = resid))
 ```
 
-You'll learn how models, and the modelr package, work in the final part of the book, [model](#model-intro).
-We're saving modelling for later because understanding what models are and how they work is easiest once you have tools of data wrangling and programming in hand.
+We're not discussing modelling in this book because understanding what models are and how they work is easiest once you have tools of data wrangling and programming in hand.
 
 ## ggplot2 calls
 
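A note on the `EDA.Rmd` hunks above: the residual idea they describe can be sketched without modelr. This is a minimal sketch, not the book's code. It assumes base R's `lm()` and `resid()`, a log-log fit, and a small made-up data frame named `toy` standing in for `diamonds`.

```r
# Hypothetical toy data standing in for the diamonds dataset (values are made up).
toy <- data.frame(
  carat = c(0.3, 0.5, 0.7, 1.0, 1.5, 2.0),
  price = c(400, 1200, 2500, 5000, 11000, 17000)
)

# Fit price from carat; the log-log form is an assumption of this sketch.
mod <- lm(log(price) ~ log(carat), data = toy)

# Residuals are actual minus predicted on the log scale;
# exp() turns them into the ratio of actual to predicted price.
toy$resid <- exp(resid(mod))

print(toy)
```

A ratio above 1 marks a stone priced higher than its carat alone would suggest, which is the "effect of carat removed" view the text refers to.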
@@ -30,11 +30,6 @@ rmd_files: [
 "vectors.Rmd",
 "iteration.Rmd",
 
-"model.Rmd",
-"model-basics.Rmd",
-"model-building.Rmd",
-"model-many.Rmd",
-
 "communicate.Rmd",
 "rmarkdown.Rmd",
 "communicate-plots.Rmd",
@@ -99,6 +99,7 @@ ggplot(df, aes(x, y)) +
 
 2. The `geom_smooth()` is somewhat misleading because the `hwy` for large engines is skewed upwards due to the inclusion of lightweight sports cars with big engines.
 Use your modelling tools to fit and display a better model.
+<!--# TO DO: Reconsider this exercise in light of removing modeling chapters. -->
 
 3. Take an exploratory graphic that you've created in the last month, and add informative titles to make it easier for others to understand.
 
@@ -2,7 +2,7 @@
 
 # Introduction {#communicate-intro}
 
-So far, you've learned the tools to get your data into R, tidy it into a form convenient for analysis, and then understand your data through transformation, visualisation and modelling.
+So far, you've learned the tools to get your data into R, tidy it into a form convenient for analysis, and then understand your data through transformation, and visualisation.
 However, it doesn't matter how great your analysis is unless you can explain it to others: you need to **communicate** your results.
 
 ```{r echo = FALSE, out.width = "75%"}
@@ -20,7 +20,6 @@ In this part of the book you will learn some useful tools that have an immediate
 - Finally, in [exploratory data analysis], you'll combine visualisation and transformation with your curiosity and scepticism to ask and answer interesting questions about data.
 
 Modelling is an important part of the exploratory process, but you don't have the skills to effectively learn or apply it yet.
-We'll come back to it in [modelling](#model-intro), once you're better equipped with more data wrangling and programming tools.
 
 Nestled among these three chapters that teach you the tools of exploration are three chapters that focus on your R workflow.
 In [workflow: basics], [workflow: scripts], and [workflow: projects] you'll learn good practices for writing and organising your R code.
@@ -639,7 +639,7 @@ There are two alternatives:
 ```
 
 Feather tends to be faster than RDS and is usable outside of R.
-RDS supports list-columns (which you'll learn about in [many models]); feather currently does not.
+RDS supports list-columns (which you'll learn about in <!--# TO DO: Link to to-be-added list columns chapter. -->); feather currently does not.
 
 ```{r, include = FALSE}
 file.remove("challenge-2.csv")
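A note on the hunk above: the claim that RDS supports list-columns can be checked with a short round trip. This is a minimal sketch under stated assumptions. It uses base R's `saveRDS()` and `readRDS()` rather than whichever writer the chapter itself uses, and the data frame and column names are hypothetical.

```r
# A data frame with a list-column; the names are hypothetical.
df <- data.frame(id = 1:2)
df$values <- list(1:3, c("a", "b"))

# Round-trip through an RDS file in a temporary location.
path <- tempfile(fileext = ".rds")
saveRDS(df, path)
df2 <- readRDS(path)

# Compare the original and the restored data frame, list-column included.
same <- identical(df, df2)
print(same)

# Clean up the temporary file.
invisible(file.remove(path))
```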
@@ -14,7 +14,7 @@ documentclass: book
 # Welcome {.unnumbered}
 
 <a href="http://amzn.to/2aHLAQ1"><img src="cover.png" alt="Buy from amazon" class="cover" width="250" height="375"/></a> This is the website for the work-in-progress 2nd edition of **"R for Data Science"**. This book will teach you how to do data science with R: You'll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it.
-In this book, you will find a practicum of skills for data science.
+<!--# TO DO: Should "model it" stay here? Omitted? Mentioned with an explanation as to where to go for modeling? --> In this book, you will find a practicum of skills for data science.
 Just as a chemist learns how to clean test tubes and stock a lab, you'll learn how to clean data and draw plots---and many other things besides.
 These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R.
 You'll learn how to use the grammar of graphics, literate programming, and reproducible research to save time.
@@ -140,7 +140,6 @@ Hypothesis confirmation is hard for two reasons:
 2. You can only use an observation once to confirm a hypothesis.
 As soon as you use it more than once you're back to doing exploratory analysis.
 This means to do hypothesis confirmation you need to "preregister" (write out in advance) your analysis plan, and not deviate from it even when you have seen the data.
-We'll talk a little about some strategies you can use to make this easier in [modelling](#model-intro).
 
 It's common to think about modelling as a tool for hypothesis confirmation, and visualisation as a tool for hypothesis generation.
 But that's a false dichotomy: models are often used for exploration, and with a little care you can use visualisation for confirmation.
@@ -423,7 +423,7 @@ There's no way to list every possible function that you might use, but here's a
 
 - Logs: `log()`, `log2()`, `log10()`.
 Logarithms are an incredibly useful transformation for dealing with data that ranges across multiple orders of magnitude.
-They also convert multiplicative relationships to additive, a feature we'll come back to in modelling.
+They also convert multiplicative relationships to additive.
 
 All else being equal, I recommend using `log2()` because it's easy to interpret: a difference of 1 on the log scale corresponds to doubling on the original scale and a difference of -1 corresponds to halving.
 
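A note on the hunk above: the `log2()` interpretation and the multiplicative-to-additive property are easy to verify numerically. This is a minimal sketch in base R; the values `x`, `a`, and `b` are arbitrary choices for illustration.

```r
# An arbitrary positive value for illustration.
x <- 8

# Doubling on the original scale is a difference of 1 on the log2 scale.
doubling <- log2(2 * x) - log2(x)

# Halving on the original scale is a difference of -1 on the log2 scale.
halving <- log2(x / 2) - log2(x)

print(c(doubling = doubling, halving = halving))

# Multiplicative relationships become additive: log(a * b) equals log(a) + log(b).
a <- 3
b <- 5
additive <- isTRUE(all.equal(log(a * b), log(a) + log(b)))
print(additive)
```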